JOURNAL OF THE AMERICAN
STATISTICAL ASSOCIATION 


VOLUME 43: 1948 


NUMBERS 241-244 


Published Quarterly by the 
AMERICAN STATISTICAL ASSOCIATION 


WASHINGTON, D. C. 


1948 










Journal of the
AMERICAN STATISTICAL
ASSOCIATION








MARCH 1948 


Statistics and Foreign Policy Willard L. Thorp 1 
History of the Uses of Modern Sampling Procedures Frederick F. Stephan 12 


The Use of Mark Sensing in a Large Scale Testing Program 
Walter L. Deemer, Jr. 40 





A Proposed Basic Course in Statistics George W. Snedecor 53
Actuarial Estimates for Public Sickness Insurance Plans Abraham M. Niessen 61 
Earnings of Nonfarm Employees in the U.S., 1890-1946 Stanley Lebergott 74 


Direct Determination of Compass Settings for Proportionate Area Pie-Charts 


Herman Lasken 94 
Profile Graphs John V. Spielmans 96 


A Method for Obtaining and Analyzing Sensitivity Data 
W. J. Dixon and A.M. Mood 109 


A Short-Cut Method of Fitting a Logistic Curve 
William A. Spurr and David R. Arnold 


Statistical Methodology Index, No. 11 Oscar Krisen Buros 


BOOK REVIEWS by Joseph Berkson, Paul S. Dwyer, Lester R. Frankel, Milton 
Friedman, T. N. E. Greville, Harry Pelle Hartkemeier, Eugene Lukacs, Philip 
J. McCarthy, Margaret Merrell, Hugo Muench, Harold Nisselson, Oystein
Ore, Edgar Z. Palmer, Charles F. Roos, Max Sasuly, William A. Spurr, Victor 
von Szeliski, and H. D. Wolfe 


LETTERS ABOUT BOOKS by C. West Churchman, John W. Dudley, Jr., and 
John W. Tukey 





VOLUME 43 NUMBER 241 





American Statistical Association 


Organized November 27, 1839 
Incorporated 1841 


The American Statistical Association is a scientific and educational organi- 
zation. Its membership is not confined to professional statisticians but includes 
economists, business executives, research directors, government officials, uni- 
versity professors, and other persons who are seriously interested in the applica- 
tion of statistical methods to practical problems, in the development of more 
useful methods, and in the improvement of basic statistical data. Engineers, 
mathematicians, biologists, actuaries, sociologists, psychologists, and representa- 
tives of many other professions are included in the membership of the Association. 


Regular membership 

Student membership 

Introductory membership (for the first dues payment of appli- 
cants under 30 years of age) 

Members subscription to Biometrics 

Associate membership in the Biometrics Section 

Contributing membership 


Subscription rate, $8.00 per annum. Prices for back issues available on request. 


Additional information about the Association and membership application 
forms may be secured from the Secretary, 1603 K Street, N. W., Washington 
6, D.C. 


The Editors welcome the submission of articles and notes for possible publication
in the JOURNAL. Manuscripts should be sent to the Editor, Journal of the
American Statistical Association, 1603 K Street, N. W., Washington 6, D.C. 
(1) Send two or more copies of your manuscript (preferably in separate enve- 
lopes, to avoid danger of loss). (2) Leave one half of the first page of your manu- 
script blank, to be used for instructions to the printer. (3) Attach to the front 
of your manuscript a fifty word outline or summary. Authors who wish sugges- 
tions about the preparation of manuscripts and charts or information about 
editorial policies should address their inquiries to the Editor. Book reviews 
should be sent to Oscar K. Buros, Rutgers University, New Brunswick, New 
Jersey. 





* $4.00 of which is for one year’s subscription to the Journal of the American Statistical 
Association, and 75 cents for one year’s subscription to The American Statistician. 










JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 





Volume 43 March 1948 Number 241





ARTICLES
Statistics and Foreign Policy . . . . Willard L. Thorp

History of the Uses of Modern Sampling Procedures . . . . Frederick F. Stephan

The Use of Mark Sensing in a Large Scale Testing Program . . . . Walter L. Deemer, Jr.

A Proposed Basic Course in Statistics . . . . George W. Snedecor

Actuarial Estimates for Public Sickness Insurance Plans . . . . Abraham M. Niessen

Earnings of Nonfarm Employees in the U.S., 1890-1946 . . . . Stanley Lebergott

Direct Determination of Compass Settings for Proportionate Area Pie-Charts . . . . Herman Lasken

Profile Graphs . . . . John V. Spielmans

A Method for Obtaining and Analyzing Sensitivity Data . . . . W. J. Dixon and A. M. Mood

A Short-Cut Method of Fitting a Logistic Curve . . . . William A. Spurr and David R. Arnold

Statistical Methodology Index, No. 11 . . . . Oscar Krisen Buros


BOOK REVIEWS
Asay, Gruta, Bevezetis a Statisztika Tudomdnydéba: Rész I . Euc=ne LuKacs 
Albert, Adrian A., College Algebra . . . . Paul S. Dwyer


Burton, Clement, A Concise Manual of Statistics: With Special Reference
to the Requirements of Students for Municipal Examination, Second
Edition . . . . Harry Pelle Hartkemeier


Dahlberg, Gunnar, Mathematische Erbanalyse von Populationen . . . . Oystein Ore


Dewey, Edward R., and Dakin, Edwin F., Cycles: The Science of Prediction . . . . Milton Friedman
. . . . Max Sasuly


Finney, D. J., Probit Analysis: A Statistical Treatment of the Sigmoid Response
Curve . . . . Margaret Merrell
. . . . Joseph Berkson








Gusreetro, Amaro D.,Manual de Estatistica . . . Hugo MuEenca 


Hays, nen An ome * meen re" Edition Haro.tp NISsELSON 
P - < Epaar Z. PALMER 


Heidingsfield, Myron S., and Blankenship, Albert B., Market and
Marketing Analysis . . . . Lester R. Frankel
. . . . Philip J. McCarthy


Mortara, Giorgio, O Custo de Produção do Homem Adulto e Sua Variação
em Relação à Mortalidade . . . . T. N. E. Greville
Scuwena, LorAnp, Statisztika Médszertani t Aeneit . Eugene LuKkacs 
Tompson, G. aane, eed ne a's *, 4 Cnsgeae F. Roos 
. + oe hoe CaS ae H. D. Wore 

Wright, Wilson, Forecasting for Profit: A Technique for Business Management . . . . William A. Spurr
. . . . Victor von Szeliski


Letters About Books . . . . John W. Tukey
. . . . John W. Dudley, Jr.
. . . . C. West Churchman










JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Number 241 MARCH 1948 Volume 43 








STATISTICS AND FOREIGN POLICY* 


WILLARD L. THORP
Assistant Secretary of State 


TO THE CARTOONIST, those who operate in the field of foreign policy
fall into one of two categories. First are the black-haired, gaunt
and cunning conspirators whose complicated and devious scheming is 
for objectives intended never to be apparent by methods which are 
sinister and super-top-secret. The other category includes the tea- 
drinking, spat-wearing butterflies who are too vacuous to have an ob- 
jective and whose method is limited to charm, with the “r” relatively 
silent. Unfortunately, neither of these pen-and-ink conceptions even 
hints at the backbone of present-day foreign policy making and imple- 
mentation, namely, careful theoretical and factual analysis by skilled 
technicians. 

An increasing number of major problems of foreign policy are of 
such a character that measurement and magnitude become elements of 
basic importance. It is worthy of note that recently, when a delegation 
from the United Kingdom arrived to discuss with us a revision of the
Bizonal Fusion Agreement for Germany, included in its impedimenta 
was a calculating machine. The classic “S” commodity list, beginning 
“shoes and ships,” has been out of date for some time due to the 
lamentable obsolescence of that colorful item known as sealing wax, 
but the new “F” list—food, fuel, fibres, and fertilizer—has top priority
on the current foreign policy agenda of most governments and these 
items appear there almost entirely as problems of magnitude. Today 
calories get more attention than kings. The slide rule, the calculating 
machine, and the statistical yearbook have become necessary tools of 
diplomacy.

The statistical puzzles arising out of foreign policy problems nat- 
urally are as varied as the universe. They are not limited by subject- 
matter, area, or time. What should be the number of whales permitted 


* Presidential address delivered at the 107th Annual Meeting of the American Statistical Associa- 
tion on December 29, 1947. 
under the International Whaling Convention to be caught during the
next whaling season in order to maintain a stable whale population, 
keeping also in mind the world shortage of fats and oils—a neat prob- 
lem in the vital statistics and caloric content of whales? What are the 
proper cost-of-living-adjustment allowances to permit U. S. Govern- 
ment representatives in various countries to maintain approximately 
equivalent living standards—an index number problem with peculiar 
difficulties not merely for price level and foreign exchange reasons, but 
because of differing national customs of hospitality and patterns of 
protocol? What was the probable amount of destruction to American 
property in Italy arising out of the War, and for which Italy must make
partial payment under the Peace Treaty—a problem of statistical in- 
ference from quite fortuitous and incomplete sample data? What is the 
fair proportion between the United States and India for the shipment 
of raw cotton to Japan under present controlled trade conditions—a 
problem in which the accepted guiding principle requires the finding of 
a “representative base period” out of a most abnormal series of years? 
What is the relationship between the availability of tobacco products 
in the Ruhr and the production of coal and steel—a problem of psycho- 
logical measurement since the proposal as made by several Senators 
rests on the allegation that tobacco products are particularly effective
as incentive goods? What amount of goods sent to Russia under the 
Lend-Lease program was presumably unused and undestroyed at the 
end of the War and thus subject to a negotiated settlement—a problem 
in war-time and peace-time property life-tables, and attrition and de- 
preciation rates? This random list of a few problems may serve to es- 
tablish the inference of the presence of statistics in the State Depart- 
ment, but the record will be clearer if we consider two illustrations in 
somewhat fuller detail. 

In a world where there are desperate shortages of commodities, the 
problem of allocation has become a matter of prime importance. 
Countries have become competing purchasers, and even, in a few 
tragic cases, competitors for relief assistance. The shortages are wide- 
spread and severe, and for many commodities there are few countries 
with an exportable surplus. Foodstuffs are in this category and no gov- 
ernment with any claim to responsibility can look away while its people 
are hungry. The State Department has probably received more aide- 
memoires, notes, memoranda, and formal and informal visitations 
from Prime Ministers, Ambassadors, foreign technicians, and even self- 
appointed representatives, concerning the subject of food allocations 
than any other single topic during the last two years. The White House 
too has had distinguished callers on the same subject. 

The development and application of the concept of equitable food
allocation, based on a careful examination of requirements and avail- 
abilities, was done first by a small international committee, then by the 
International Emergency Food Council and is now in process of being 
taken over by the Food and Agriculture Organization, one of the spe- 
cialized agencies of the United Nations. This international body makes 
recommendations to the supplying countries as to the proper distribu- 
tion of their surpluses and these recommendations are followed with 
little variation. 

The problem is a most complex one. The basic unit for comparing 
food levels is the calorie, but unfortunately the simple definition in 
Webster that a calorie is the amount of heat required to raise the tem- 
perature of one kilogram of water one degree Centigrade has not been 
so exact and indisputable when applied in the field of nutrition. Two 
caloric tables for valuing foods are in general use now, one by the 
U. S. Army and one by the International Emergency Food Council.
There are at least half a dozen other tables used in various parts of the
world. The two principal ones vary in caloric content from ten to fifteen
per cent. Similarly, the effort to measure various types of grain in 
“wheat equivalent” opens the door to a thorough state of confusion. 
It is self-evident that when statisticians of many countries meet to- 
gether to discuss a given problem, a primary requirement is that there 
be some common measure for setting down the facts and this has been 
a major task in the food field. Although this does eliminate one standard 
area for professional controversy, there will always remain enough 
other factors of disagreement to permit full self-expression. 

Obviously, the first information required for making international 
allocations is that pertaining to the requirement and the indigenous 
supply in each country. If these can be satisfactorily determined, the 
import requirement follows merely by subtraction. At once it is neces- 
sary to remark that in many countries where industrial production is 
lagging the production of statistics is also below pre-war both in qual- 
ity and quantity. Unfortunately, statistical organizations have been 
disorganized at the same time that the items to be measured have 
been subjected to wide variation. Even as basic a datum as population 
must be approached through estimation. Substantial movements of 
peoples have taken place, in addition to the abnormal effects of war 
on both birth and death rates. And none of the devastated countries 
has had the time or energy to take a post-war census, necessary to 
obtain a relatively secure benchmark. Similarly, agricultural production
in most cases is measured much less accurately than before the
war by the surviving statistical agencies in the various governments. 
And even less certain, both abroad and at home, are the important
estimates of wheat and coarse grains consumed by man and beast on 
the farm and thus not moving into the available supply. However, on 
the basis of the pre-war picture for which more reliable data are avail- 
able, and the records for the past two years, including ration levels, 
the amounts imported, and the apparent stocks on hand, estimated im- 
port requirements for the current year, quarterly and by months, are 
worked out regularly by the experts. 
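
The subtraction described above can be sketched in a few lines. This is only an illustration: the country names and figures are hypothetical placeholders, not data from the address, and the function name `import_requirement` and the floor at zero for a surplus country are assumptions introduced here.

```python
def import_requirement(requirement, indigenous_supply):
    """Import requirement = requirement minus indigenous supply.

    Assumption: a country whose own supply exceeds its requirement
    is treated as needing no imports (floor at zero).
    """
    return max(requirement - indigenous_supply, 0.0)


# Hypothetical figures, thousand metric tons of wheat equivalent.
countries = {
    "Country A": {"requirement": 5200.0, "indigenous_supply": 3900.0},
    "Country B": {"requirement": 1800.0, "indigenous_supply": 2100.0},
}

needs = {name: import_requirement(**row) for name, row in countries.items()}
total_need = sum(needs.values())

for name, need in needs.items():
    print(f"{name}: import requirement {need:.0f}")
print(f"Total: {total_need:.0f}")
```

The arithmetic is trivial; as the text stresses, the difficulty lies in the two inputs, both of which must themselves be estimated.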

On the other side of the equation, the possible export surplus, three 
of the chief exporting countries have regularly indicated their avail- 
abilities as best they can. Argentina, which provides about twenty 
per cent of the world’s exports of grain, has not participated in the 
effort to plan the most effective distribution of the available supply. 
This unfortunate situation has been met in part by the other exporting 
countries through adjustment of the allocation to offset supplies ob- 
tained from the Argentine. Russia’s exports, which have been limited 
in amount and have gone to very few recipient countries, have also not 
been subject to international allocation. 

But even the direct facts on immediate requirements and supplies 
are not enough to solve the problem. Wheat, of course, is only one of 
many foods. It happens to be the cheapest form in which calories can 
be purchased in substantial quantities. In allocating wheat, the inter- 
national committee must consider what other foods are available in 
the country with a wheat deficit. Obviously, it is quite proper in the 
light of the world shortage to cut down on wheat shipments to coun- 
tries which have a fair amount of other foods. Even then, however, one 
must have some regard for the necessities of balanced diets. Further- 
more, food habits and food requirements must be considered. Even in 
times of great need, a people does not change its way of eating overnight.
New foods are not easily introduced, even to a starving people,
and established prejudices are surprisingly tenacious. For example, 
corn is not regarded as a proper food for human beings in a number of 
European countries, nor are potatoes in the normal diet of Italy, while
rice is consumed much more than wheat in Cuba. Differences are not 
merely the result of taste. Countries with long winters have different 
requirements from those in the tropics. And even in our own experience 
as an occupying power, we have recognized that a much higher caloric 
requirement is needed for Germans than for Japanese to maintain the 
same health level. So the statistician, in figuring the amount of wheat 
to allocate to any deficit nation, must have a clear picture of that coun- 
try’s historical eating habits and be familiar with its preferences and 
prejudices. 

Finally, in considering the amount which is to be permitted to come
from abroad, the allocations must not work in such a way as to punish 
the country which brings its maximum to the market-place, or all enter- 
prise and initiative in the direction of improved collections from the 
farms will be destroyed. Conversely, there must be some penalty for 
failure to use the indigenous supplies most efficiently. 

Since the beginning of the allocation procedure, it has always been 
true that the screened requirements for foodstuffs for all the deficit 
countries have totaled to substantially more than the availabilities, 
and here the really painful job begins—the effort to determine where 
the requirements can be cut with the minimum of hardship. The figures 
of each country are reviewed again and again, and there are many 
conferences to explore various aspects of the situation more thoroughly. 
Finally, the allocations are announced. At least, the process has made 
everyone aware of the limitations on supply and the urgency of the 
demands from other countries. 

These random comments about the allocation machinery may make 
the task appear exceedingly complex. But the fact remains that the 
job must be done. These formidable calculations, aimed to take into 
consideration both the overall requirements and supply situation, and 
the peculiar circumstances in each case, are the only hope of providing 
some basis of fairness and equity in the distribution of scarce things to 
people who are in desperate need of them. 

There is no question but that living for millions of individuals for 
the next few years at least, will have to continue under rationing and 
allocations of critical, scarce commodities. The people in these countries
know that death from starvation is just as permanent as death
from bombing. They know that allocations and rationing are protec- 
tions to their lives—that the rationing of milk, for instance, may cut 
down the number of fancy dishes served in fine hotels, but it does get 
the needed food to more mothers and infants for whom it is an essen- 
tial. The international allocations in the same way are an effort more 
nearly to equalize the burden of the shortage on the people of the vari- 
ous countries, not leaving the distribution solely to ability to offer the 
highest bid or to the appeal of political sympathy, obligation or reward. 

The decision having been made to disregard economic bargaining or 
political discrimination and to place allocations on an objective basis, 
the key to this whole process becomes the little-heralded statisticians 
—both those who must present the case for their countries, and those 
who must screen the competing claims and bring them into a reason- 
able relationship with each other. The day and night work and worry is 
theirs, but they have built in large part upon the work of other statisticians
whose work had gone before. I hesitate to think how impossible
it would be to handle this problem had it not been for the continued
collection and analysis of agricultural statistics and nutritional data in 
many countries for many years. 

As a second illustration may I speak briefly about the European 
Recovery Plan? The last six months have seen a most difficult and 
complex statistical undertaking in Washington—the examination of 
the requirements for European recovery and the study of the capacity 
of the American economy and other economies to carry the European 
deficit in the meantime. This task has absorbed the full energies and 
capacities of many experts in many government agencies. The only 
relief for the central group directing the project was temporary when 
in a lighter moment they decided to call themselves the Technical Wiz- 
ards on the European Recovery Program, or the TWERPS, for short. 

Anyone familiar with Washington during wartime can easily visualize 
the time and energy required to develop the details of a plan involving 
sixteen countries and a four-and-a-quarter year period. I remember in 
the late twenties being told by a Russian economist about the tre- 
mendous efforts required and the manpower devoted to drawing up the 
five-year plan. Last summer, I saw French economists and statisticians 
in a state of near exhaustion from working on the so-called Monnet 
Plan. No one should regard an undertaking of this kind lightly. There 
have been no days, and even at times no nights, of rest. 

This project stems back, of course, to the suggestion by Secretary 
Marshall that the countries of Europe get together, examine what their 
requirements will be over a period of time sufficient to permit them to 
put their economies on a self-supporting basis, determine how much of
these requirements they can meet separately and collectively, and thus 
indicate what additional help is needed to accomplish the program. 

Sixteen of the Western European nations met in Paris last summer 
and, in an incredibly short period of weeks, drew up a program on which 
they could all agree. Undoubtedly, this agreement was possible because 
the essential elements of European recovery are beyond dispute—that 
production must be substantially increased, sound currencies must be 
established, and the restrictions on trade must be reduced. Their national
requirements were presented and assembled. After some slight
screening and reduction where the requirements were clearly beyond
the possibility of supply, the so-called deficit was calculated. This was
all incorporated in the Report of the Committee of European Econom- 
ic Cooperation which, together with a number of technical annexes, 
was sent to Washington in September. 

Work was already well under way in Washington by that time, particularly
with reference to the capacity of our own economy to meet
such foreign demands and the effect of such an operation upon our own 
economic operations. But the review of the European plan has proved 
to be a most complicated undertaking. Covering a period of four-and- 
one-quarter years, the program for each of sixteen countries and West- 
ern Germany had to be consistent as between its internal program and 
its export and import programs. For the total of all countries, the requirements
from abroad and the availability of supply had to balance.
Similarly, for the various individual commodities, demand and supply 
had to balance. For each country, its balance of payments had to be in 
equilibrium. And when supplies could not be obtained in the United 
States but could be found in other supplying countries, these prospec- 
tive sources had to be determined and incorporated into the pattern. 
In other words, the total pattern had to balance not merely as to an 
overall figure, but by commodities, by country physical requirements, 
and by country balances of payments. Similar patterns had to be pre- 
pared for each year within the period. And finally, the attempt to 
achieve both commodity and the balance-of-payments estimates in turn 
had to be broken down by currency areas in order to indicate the na- 
ture of the deficit with the dollar area. 

The number of arithmetical calculations which have gone into these
estimates probably totals more than a million. Five time periods are
covered—the last quarter of the present fiscal year, April 1 to June 30, 
1948, and the four successive years until June 30, 1952. Twenty-three 
areas were involved beginning with the sixteen countries which par- 
ticipated at Paris, the dependent areas of the United Kingdom, Bel- 
gium, France, Netherlands, Portugal, and Western Germany in three 
parts—the bizonal area, the French Zone excluding the Saar, and the 
Saar itself (since the Saar territory may shortly be incorporated eco- 
nomically into France). Twenty-six commodity groups were selected 
for particularly intensive treatment, of which a number such as iron 
and steel have to be built up from a series of separate categories—pig 
iron, scrap steel, iron ore, crude and semi-finished steel, tin plate, steel 
sheet and other finished steel—and for projections of the volume and 
value of trade covering movements of commodities among the par- 
ticipating countries, between the participating countries and the United 
States and other Western Hemisphere and other non-participating 
countries in turn. After the volume of this trade was derived, it had 
to be multiplied by prices to obtain values. While July 1, 1947 prices 
were used, it was necessary to ascertain prevailing prices in different 
areas of the world on that date since the prewar assumption that
prices for internationally traded commodities tend to be equal the world 
over has lost its validity under the conditions which prevail today. 
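
The bookkeeping described above can be illustrated with a short sketch. The three dimension counts come from the text; the area names, volumes, and prices are hypothetical placeholders, and pricing a shipment at the level prevailing in the supplying area is an assumption made here only to show why a separate price had to be ascertained for each area.

```python
PERIODS = 5            # time periods covered (from the text)
AREAS = 23             # areas involved (from the text)
COMMODITY_GROUPS = 26  # commodity groups given intensive treatment (from the text)

# One volume figure per period, area, and commodity group.
cells_per_table = PERIODS * AREAS * COMMODITY_GROUPS

# Hypothetical July 1, 1947 prices (dollars per metric ton) by supplying area.
price_by_area = {"Area X": 95.0, "Area Y": 120.0}


def trade_value(volume_tons, supplying_area):
    """Value of a shipment: volume times the price assumed for the supplying area."""
    return volume_tons * price_by_area[supplying_area]


value_x = trade_value(1_000.0, "Area X")
value_y = trade_value(1_000.0, "Area Y")

print(cells_per_table)
print(value_x, value_y)
```

A single table of this shape already has several thousand entries, before the separate categories within a group such as iron and steel, or the direction of each trade movement, are distinguished.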

The task of combining the figures provided by the commodity 
committees into a coherent system from which balance-of-payments 
estimates could be derived fell, as it happened, to the Department of 
State. An early courageous attempt was made to grapple with it by 
assembling the adding and calculation machines of which the Depart- 
ment can boast only a sparse and scattered population, and by amass- 
ing at the same time the clerical assistance necessary to man or 
“woman” the machines. The attempt was futile. The traditions of the 
Department of State and its personnel training are oriented more toward
the accurate and careful phrasing of a memorandum than the
well-multiplied, checked and proven statistical table. Resort was 
necessarily had to punch cards, and automatic sorting and addition. 
It may be that the Foreign Office of 10 years hence will boast a full 
line of international business machines with operators in 24-hour at- 
tendance. As of today, it was necessary to work the calculations in on 
the graveyard shift at Census, Bureau of Labor Statistics, and finally 
in the Department of National Defense. 

One further difficulty was to establish the price assumptions to be 
used for the future period. Here the crystal ball was particularly cloudy. 
The Paris Conference had used the prices then current, July 1, 1947, 
as the basis for both exports and imports for the first year (1948) and 
had then assumed that European export prices would remain firm, 
while import prices would decline by 7½ per cent in 1949, 10 per cent in
1950, and 12½ per cent in 1951. The American reviewers have felt
that the only way out of this dilemma was to present a range. Actu- 
ally, the basic calculations were made in July 1, 1947 prices, but the 
totals have been adjusted globally to meet different sets of price as- 
sumptions. 

For the first fifteen months, all exports from Europe and imports 
to Europe from the Western Hemisphere except U. S., are calculated 
at 5 per cent above July 1, 1947 prices, while U. S. and non-Western 
Hemisphere shipments to Europe are 7.5 per cent above July 1, 1947. 
In the later years, the calculations on the high price assumption hold 
the price level constant with the level for the initial period, while the 
calculations for the low price assumption are based on a marked de- 
cline, particularly for the items imported to Europe. These different 
assumptions explain the presence of a range in the total requirement for 
the program of $15.1-$17.8 billions. 
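
The first-period adjustment just described amounts to multiplying a value at July 1, 1947 prices by a factor that depends on the kind of shipment. A minimal sketch follows; only the 5 and 7.5 per cent markups come from the text, while the flow labels, the function name `adjust`, and the dollar amounts are hypothetical.

```python
# Markups over July 1, 1947 prices for the first fifteen months (from the text).
MARKUP = {
    "exports_from_europe": 0.05,
    "imports_from_western_hemisphere_ex_us": 0.05,
    "imports_from_us": 0.075,
    "imports_from_non_western_hemisphere": 0.075,
}


def adjust(value_at_base_prices, flow):
    """Restate a value computed at July 1, 1947 prices at the assumed price level."""
    return value_at_base_prices * (1.0 + MARKUP[flow])


# Hypothetical values in millions of dollars at July 1, 1947 prices.
us_imports = adjust(1000.0, "imports_from_us")
european_exports = adjust(200.0, "exports_from_europe")

print(round(us_imports, 2), round(european_exports, 2))
```

Because the totals were adjusted globally rather than item by item, a different set of price assumptions requires only a different table of factors, which is how a range rather than a single figure could be presented.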

The projections are, of course, not blueprints which can be followed
during operation. Actually, this is a sketch rather than a blueprint. 
Its purpose is to provide Congress and the public with as accurate an 
estimate as possible of what the program may in fact turn out to be, 
and the general magnitude of the requirements from abroad, if the 
European recovery program is to be accomplished. If some of the 
items are not available in the quantities indicated, there may be sub- 
stitution. If availabilities increase, prices will fall, or if they decrease, 
prices will rise. In either event, the dollars involved will tend to be 
more nearly constant than the constituent elements. And with so 
many commodities and countries, we can fall back on the protection of 
all statisticians, the hope that the deviations will be somewhat com- 
pensatory. 

In the original undertaking, five sets of questionnaires were drawn 
up by the European group to obtain information on food, fuels, ma- 
chinery, iron and steel, transport, and balance of payments. This 
information has been available to us in Washington, and to it has been 
added further information which we requested, plus the vast reservoir 
of knowledge accumulated in our government. However, it is unfor- 
tunately clear that there are some serious gaps in the basic information 
required. 

Even with the most complete information possible, there could be 
no assured results. The most that we can do is to achieve consistent
and logical results from as reasonable assumptions as can be made. As 
I have already pointed out, assumptions had to be made as to price 
levels. Another uncertainty is created by the necessity to estimate 
crops. Should we assume that the weather will continue to be as un- 
cooperative with the farmer in Europe as it has been since the end of 
the War? And there are many other unknowns, such as at what point 
the processes of commodity hoarding will cease and money will be used 
again as a store of value. 

It is clear that the Recovery Program will have to be a dynamic 
and flexible operation. As was true during the War, programs will have 
to be changed from time to time as conditions change, both as to coun- 
tries and as to commodities. To achieve the most effective use of the 
available resources, there will need to be continued and detailed statistical
recording of progress made, and forecasts of the short- and long-run
prospects. The injection of statistical methods into foreign policy
is therefore no temporary expedient—but promises to be a continuing 
necessity.

Certain conclusions are now apparent concerning our capacity to 
reduce the problems which I have been discussing to exact measurement.
The first is the common complaint of the statistician—that we
do not have adequate data. This is, of course, particularly true of the
countries where the effects of the War are still felt so severely. On the 
American side, we know much too little about the statistical quality 
and relevance of much of the foreign data. It is clear that these inter- 
national projects require cooperation and understanding between the 
statisticians of all participating nations. And time can be used up 
most rapidly if one is skipping about among long tons, short tons, and 
metric tons, not to mention bushels, quintals, hundredweights, barrels 
and Imperial gallons. 

On this point, that of the development of statistical data and the 
effort to achieve greater uniformity, there is much that the United 
Nations can do, supplemented by the private international statistical 
organizations. The establishment of a Statistical Commission by the 
United Nations and the international statistical meetings held in 
Washington last September may give us some encouragement. This is a 
long-time job—it calls for continuous support and stimulation. If 
much is to be accomplished, the statisticians in the United States must 
take the lead. We must continually be prepared to demonstrate that, in 
this modern world, many problems can be faced properly and solved 
economically only when measurement is respected as a fundamental 
characteristic of the analysis. 

But beyond these points, there is a continuing frustration because 
too few relationships have been reduced to calculable form. It is ob- 
vious that planning really requires both cost and market data, that the 
requirements in the form of materials must be readily related to ca- 
pacity, that labor supply and working capital requirements must all be 
part of such consideration. Here the statistician can make endless 
contributions, substituting detailed analysis for the rule of thumb or 
the experimental approach. In these matters, too much is at stake to 
be careless or casual. Nevertheless, our answers often contain far too 
much of the “rough estimate” and too little of the careful calculation. 

The same problems arise in connection with the use of statistics in 
the foreign policy field as in any other—the struggle to do an honest 
and objective job, and the difficulties of convincing others that such 
is the result. There are always those who would prefer to be guided by 
emotion, prejudice, and preference. It is not at all surprising that 
suspicions which continually cross national boundaries are not stopped 
by statistics. 

 Nevertheless, the fact remains that problems must be handled;
there must be answers; and statisticians, both here and abroad, can 
contribute much to finding the best answers. As a professional group, 
we have a clear-cut responsibility. First, we must continually strive to 
improve our performance through developing our own capacities and 
improving the raw material with which we work. But beyond that, we 
must persistently try to persuade others to take an objective view of 
the facts, to use to the full the techniques and capacities which we have 
to offer. We cannot promise to solve all the world’s problems. Many of 
them are not problems of measurement at all. But where possible the 
utilization of statistics, the reference to objective criteria, and the 
effort to measure before committing oneself to a line of action—these 
are all ways in which rational men approach problems. And foreign 
policy should be no exception to the rule. 

Unfortunately, foreign policy is an area where it is all too easy for 
emotions and prejudices to be aroused, where problems all too often 
get subjective treatment. One approach towards international under- 
standing is to expose problems to the facts whenever possible—and 
they must be accurate and dependable facts. The battle between 
prejudice and analysis is our battle, in which we must supply much of 
the ammunition. Our responsibilities and opportunities have greatly 
increased in the international field. We not only can help in the
development of knowledge about the world we live in, but we can
actually contribute substantially to that international understanding which
is so greatly needed in the world today. 











HISTORY OF THE USES OF MODERN 
SAMPLING PROCEDURES* 


FREDERICK F. STEPHAN†
Princeton University 


I. INTRODUCTION 


 MODERN sampling practice is the result of combining statistical
theory with scientific knowledge about the material that is being 
sampled and with experience that has been gained through the use of 
various sampling techniques in surveys and experimental work. Its 
history stems from many roots and its applications branch out into 
many fields of science, commerce, manufacturing, agriculture, educa- 
tion, and government administration. In spite of its wide range of 
usefulness, sampling practice has been neglected in the training of 
statisticians, in the textbooks and treatises, and in the planning and 
analysis of most experiments and studies [1]. However, like Cinderella, 
it has risen from neglect to a position of well-deserved importance. 

The history of sampling practice provides a useful background for 
the discussion of present theory and applications. Difficulties beset any- 
one who attempts to trace the developing uses of sampling because they 
are scattered throughout many branches of science and technology and 
are described, if at all, in subordinate portions of reports and articles 
whose titles provide no hint of what they may contain on the subject of 
sampling. Hence, this paper not only will touch each development very 
briefly and inadequately but no doubt it will miss entirely a number of 
highly important facts. (The author will appreciate suggestions for 
correction and addition.) The principal emphasis will be on the steps 
by which the present stage of progress in applied sampling was reached 
and no attempt will be made to present a comprehensive picture of 
present-day uses or to appraise the relative value of each contribution 
to the accumulation of technical knowledge. 

There are two dangers that both author and reader must avoid in an 
historical review: (a) they may read into the records more than was 
actually there and (b) they may assume that nothing existed that is 
not given in the records. I hold to the view that the principles and 


* Presented at the 25th Session of the International Statistical Institute, a constituent part of the 
International Statistical Conferences, Washington, D. C., September 6-18, 1947. This paper will also 
appear in the Proceedings of the Conferences, which will be published in the near future. 

† This paper was prepared in connection with studies of sampling under the Committee on the
Measurement of Opinion, Attitudes and Consumer Wants of the National Research Council and Social
Science Research Council.
technical practices of sampling emerged rather gradually from notions 
that were simpler, sometimes inaccurate and confused, and often not 
recognized by those who held them as being of any special importance. 
Hence the records are fragmentary and there were doubtless many in- 
stances in which principles were developed by two or more workers in 
ignorance of the other’s work. I would not say that the same discovery 
was made “independently” for there is always a flow of common ideas, 
sometimes rapid and sometimes slow, in which all share. Hence it 
should be understood that the particular examples that will be men- 
tioned are part of a larger development of practice and theory which 
connected them in various known and unknown ways and often in- 
fluenced profoundly their characteristics and effectiveness. 

Finally, greater attention will be given to the use of sampling in 
large-scale surveys rather than to its use in the laboratory or in small- 
scale (local) experiments. 


II. EARLY EXAMPLES OF THE USE OF SAMPLES 


The earliest examples of sampling procedures are to be found in cer- 
tain very ordinary human activities. The common practice of taking 
a small part or portion for tasting or testing to determine the charac- 
teristics of the whole precedes recorded history and is one of the roots 
from which sampling methodology stems. Stirring and mixing before 
taking the sample is a prototype of randomization. The efforts of 
scientists and their predecessors to draw conclusions about the laws 
of nature from what they could observe in their immediate environ- 
ment was a sampling process. Astronomy, perhaps the oldest of the 
sciences, started with the moon, the larger planets, and the stars that 
one could see from positions near the equator or north of it. Even to- 
day it is limited to those heavenly bodies that can be photographed 
with the best available instruments. 

 All empirical knowledge is, in a fundamental sense, derived from
incomplete or imperfect observation and is, therefore, a sampling of
experience. An unusually interesting example, not unrelated to modern 
population research, can be found in Halley’s selection of mortality 
statistics in Breslau to form the basis of his life table. From this sample 
of one city he drew general conclusions pertaining to “the mortality of 
mankind” [2]. This was in 1693. Two centuries later, Sir John Lawes 
likewise used the annual record of wheat yields at Rothamsted on 5 
plots totalling 33 acres to estimate the change in yield per acre from 
1852 to 1879 for all of England and Wales [3]. While these examples 
were both statistical, all scientific observation, whether statistical or 
not, is based on sampling. 

 In spite of the widespread practice of sampling in commerce and
industry and its utilization in scientific research, official statisticians have 
usually set as their goal the complete counting of a population and have 
sought to avoid any use of sampling or any inference beyond the bare 
description of that population at the time it was counted. Still they 
have been compelled to compromise, in the face of practical difficulties, 
on something less than complete coverage and often also on something 
less than adequate accuracy and quality in the information recorded. 
Such compromises have been common in the collection of many kinds 
of statistics, particularly those usually collected by voluntary reporting 
such as employment, wage rates, prices, and crop yields. They are also 
found in the collection of data that ought to be complete in order to 
satisfy certain non-statistical purposes, as for example in the registra- 
tion of births. Often the official statistician, faced with great difficulties, 
opposition, and indifference on the part of the public and with only 
weak powers to support his efforts, could only compromise or resign 
his post. The result of the compromise was often a very crude and un- 
satisfactory form of sampling which strove to get as much information 
“as possible” and accepted what it got, instead of attempting to con- 
trol the selection by some of the means that were available. 

Interestingly enough, sampling has generally preceded the establish- 
ment of regular censuses. Prior to the first British Census in 1801 the 
size of the population of England was not known very accurately and 
various speculations were made and estimates prepared and disputed. 
Malthus projected his famous theory of population, and Adam Smith 
his Wealth of Nations, before the establishment of periodic national 
censuses. At least as early as 1754, estimates of population of England 
were made from the number of houses on the tax list plus a rude esti- 
mate of cottages not taxed, the total of dwellings being multiplied by a 
somewhat arbitrary factor of 6 persons per dwelling [4]. Other esti- 
mates were based on the reported number of baptisms, marriages and 
burials. 

In 1800 Sir Frederick Morton Eden estimated the population of 
Great Britain at 9,000,000 using sample data on the average number of 
inhabitants per house as well as the number of births. The first census 
of Great Britain in 1801 confirmed his estimate. Somewhat similar 
methods of estimation have been used between censuses in the U.S.A. 
as recently as the last decade [5]. 

 About 1765 Messance, and in 1778 Moheau, published very carefully
prepared estimates for France based on enumeration of population in
certain districts and on the count of births, deaths and marriages as
reported for the whole country. The districts from which the ratio of 
inhabitants to births was determined truly constituted a sample. 
Laplace prepared similar estimates in 1802 from enumeration of 30 
departments of France and the reports of births, deaths, and marriages 
from 1799 to 1802. This followed a plan he published in 1786. He made 
a remarkable step forward in attempting to measure the precision of 
his estimate and announced that the odds were 1161 to 1 that it was 
not in error by more than 500,000 inhabitants or 12 per cent of the total 
population. Even though the method of estimate was crude and the 
measure of precision not wholly valid, Laplace’s effort was much more 
successful than the complete census of France that was attempted at 
the same time. A similar procedure had been employed in 1784-91 by 
the famous chemist Lavoisier to estimate the number of horses, cattle, 
sheep, and pigs, as well as the area under cultivation [6]. 
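
The ratio method described above can be sketched in a few lines of Python. This is only an illustration: the function name `ratio_estimate` and all of the figures are hypothetical and are not taken from the work of Messance, Moheau, or Laplace. The sketch assumes that the ratio of inhabitants to births observed in the enumerated districts holds for the country as a whole.

```python
def ratio_estimate(sample_population, sample_births, national_births):
    """Estimate a national population by the ratio method.

    The ratio of inhabitants to births is measured in the enumerated
    (sample) districts and applied to the birth count reported for the
    whole country.
    """
    if sample_births <= 0:
        raise ValueError("sample_births must be positive")
    inhabitants_per_birth = sample_population / sample_births
    return inhabitants_per_birth * national_births


# Hypothetical figures, chosen only to show the arithmetic.
estimate = ratio_estimate(
    sample_population=2_000_000,   # inhabitants counted in sample districts
    sample_births=70_000,          # births recorded in the same districts
    national_births=1_000_000,     # births reported for the whole country
)
print(f"Estimated national population: {estimate:,.0f}")
```

The quality of such an estimate depends on how well the sample districts represent the country, which is the point made in the text about the districts constituting a sample.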

The foregoing examples suggest that modern sampling procedure 
might have developed at least a century sooner than it did if it had 
received more attention from the scientists of the day. Only the 
officials of statistical bureaus and other posts continued to concern 
themselves with the problems of measuring population, and they were 
preoccupied with difficulties in the classification and interpretation of 
data, in the administration of statistical bureaus, and in the important 
problems of trade, finance, industry, agriculture, public health, etc. 
for which statistical data were needed. Hence they favored complete 
censuses or the closest approach to them that was feasible. 


III. EXAMPLES OF DEVELOPMENT OF NEED FOR 
EFFICIENT SAMPLING METHODS 


It would be interesting to explore the development of statistical 
work in all its many ramifications and inquire about the places in 
which modern sampling methods could have been used to good ad- 
vantage. That is beyond the possibilities of a paper such as this. In- 
stead examples will be given from the beginnings of four general lines 
of statistical work in which efficient sampling methods were needed: 
(a) agricultural crop estimates, (b) economic statistics of prices, wages,
employment, etc., (c) statistical phases of social surveys and health 
studies, and (d) public opinion polling. The more recent developments 
of the use of sampling in these and other fields will be traced in suc- 
ceeding sections. 

(a) Agricultural crop and livestock estimates: At the present time a 
tremendous variety of statistics are collected in the United States on 
acreage planted in each principal or special crop, the estimated yield as 
judged at successive dates during the growing season, actual yields, 
numbers of livestock, equipment, farm labor, marketing, and other 
aspects of agricultural production. Other countries collect similar 
statistics with variations in scope, detail and frequency. In the U.S. as 
in many countries, the methods used for this purpose include periodic 
censuses, special surveys, voluntary reporting by selected respondents, 
and records produced in connection with taxation, marketing and 
foreign trade. 

This vast system had its beginnings about a century ago. The first 
U. S. Census of Agriculture was taken in 1840. At the same time col- 
lection of agricultural statistics was begun by the Patent Office. 
Monthly reports were collected and published by the editor of the 
American Agriculturist in 1862 and in 1866 regular reports on acreage, 
condition of crops, yield, and livestock were begun in the newly estab- 
lished Department of Agriculture. Annual reports of prices were 
added in 1867. Thereafter the system developed into an extensive 
organization of agents and voluntary reporters that increased from 
4000 in the late 70’s to 300,000 in 1931 and a somewhat larger number 
now [7]. (The present system is described in a paper presented by Dr. 
C. F. Sarle at the International Statistical Conferences.) The methods 
of sampling were quite crude; many of the reports were simply based 
on the respondent’s judgment about conditions in his locality and the 
list of respondents was built up of farmers, agricultural agents, members 
of the staff of agricultural colleges, etc. who were willing to serve with- 
out pay and were located at places widely scattered over the country. 
The system was a compromise between the excessive cost and slow re- 
turns of complete enumeration on the one hand and the wholly un- 
satisfactory alternative of relying on reports issued by private specula- 
tors or going without current information, on the other. Various at- 
tempts were made to improve the reliability of the statistics that re- 
sulted from these reports but they continued to be biased and of 
limited accuracy. Nevertheless they have been of great value to 
farmers and to other users of the statistics and they are the principal 
source of current data on agriculture. Here as elsewhere the introduction
of modern sampling methods came late but is now making great 
contributions to the improvement of the statistical information in this 
field. 

(b) Economic statistics: The history of economic statistics is a long 
and detailed story which can only be sketched here in inadequate out- 
line [8]. Again, as for population and agriculture, there were early at- 
tempts to make the best of scanty and fragmentary material. Censuses 
and special investigations have provided increasingly valuable data but 
at long intervals and in inadequate detail. 

In many countries, economic statistics derived primarily from the 
records produced by taxation and customs duties. This was true in 
the United States. Statistics about subjects that were not provided by 
the system of taxation had to be obtained by other means. More than 
a century ago the collection of reports of prices, wage rates, hours of 
work, employment, and production was begun with voluntary report- 
ing by employers and occasional special surveys as the principal 
methods. Massachusetts established a statistical bureau in 1869, 
Pennsylvania in 1870, and other states at later dates; the Federal 
Government set up a Labor Bureau in 1884. These bureaus developed 
staffs of field investigators to collect data at monthly or other periods 
but relied heavily on mail questionnaires for many of their principal 
series. This continues to be the prevalent procedure although the 
establishment of social insurance systems and employment exchanges 
has provided other sources of statistics on earnings and employment. 

Similar systems of collecting reports are operated by other agencies 
concerned with production, trade, and finance. These inquiries have 
been based on simple sampling procedures such as the use of a scat- 
tered group of voluntary reporters, except for certain relatively recent 
surveys. A serious effort was made, for example, to get wholesale 
prices “in representative markets” and to get labor data by sending 
agents into various districts with a list of employers from which they 
could choose what they believed to be a representative group, but the 
techniques of selecting a representative sample had not been developed 
as formal procedures. Here again modern sampling procedures have 
been introduced late but are making noteworthy contributions, in con- 
junction with other improvements, to the accuracy and value of these 
statistics. 

(c) Social surveys and health surveys: The distinction between social 
surveys and certain related kinds of inquiries is difficult to make. Some, 
like Le Play’s monographic studies, are very intensive observations of 
a few individuals or families; others include thousands of cases but less 
intensively and with more attention to statistical analysis. Williams 
and Zimmerman listed 1500 studies of family living in the U. S. and 
elsewhere, made prior to 1935 [9]. Booth’s Survey of London Life and 
Labour, Rowntree’s survey of poverty in York, the reports of the 
Immigration Commission in 1907, and the Pittsburgh Survey of the 
Russell Sage Foundation are examples. While complete and thorough 
investigation of the problems was the ideal to which such surveys 
aspired, they were necessarily limited to a selected city, and to somewhat
fragmentary data on pertinent questions by the sheer impossibility
of a comprehensive canvass. Other surveys were made under the 
inspiration of their example but progress in methodology came slowly 
and as in many other investigations the accuracy of the results was 
seldom tested. 

Sampling has been used extensively in studies of poverty and un- 
employment. During the depression of 1873-79 Carroll D. Wright used 
police in 19 cities and wrote to assessors in 375 towns throughout 
Massachusetts inquiring about the number of unemployed. His esti- 
mate of 28,508 corrected the current assertions that 200,000 or 300,000 
workers were out of work [10]. Similarly in 1873 the number of unem- 
ployed in New York City was estimated on the basis of reports from 400 
volunteer visitors of the Association for Improving the Condition of 
the Poor. These procedures were, of course, crude and inaccurate but 
they reflected a praiseworthy desire to measure what was being dis- 
cussed in still less accurate terms by important public leaders. In 
1893, unemployment was estimated indirectly by the decline in fac- 
tory employment from its highest figure in a series derived from a 
sample of reporting employers constituting 70 per cent of the produc- 
tion in the state [11]. In 1893, estimates were collected by mail from 
respondents in 119 cities by Bradstreet. In New York City, Chicago 
and other cities, police and health inspectors surveyed samples of 
homes or factories [12]. In the depression of 1914-15, many surveys of 
unemployment were made, notably by canvass of the workers insured 
by the Metropolitan Life Insurance Co. through its agents in 29 cities 
[13] and by canvass of 104 city blocks and 3703 additional tenement 
houses by the Bureau of Labor Statistics using about 100 tenement 
house inspectors. Both agencies repeated their surveys in New York 
City eight months later [14]. 

In the depression of 1921-22, somewhat similar methods were used to 
estimate unemployment and, in general, while the results were far from 
satisfactory, there was improvement in the identification of the un- 
employed as distinct from other needy groups and increased detail in 
the information collected. Especially noteworthy as an advance beyond 
previous practice was the survey conducted in Columbus, Ohio, by 
Frederick E. Croxton and Mary Louise Mark using 100 university 
students to canvass three districts within the city selected by the presidents
of the Federation of Labor and the Chamber of Commerce as a 
“fair sample of the wage-earning population of the city” and including 
approximately 10 per cent of the wage-earning population. The survey 
was repeated annually from 1922 to 1925 [15]. A similar procedure was 
used in Philadelphia by J. Frederic Dewhurst and Ernest A. Tupper 
who had school attendance officers canvass 166 small districts selected 
to provide an accurate cross-section of unemployment [16]. This survey 
was repeated annually during the 30’s and provided several kinds of 
data not previously available. The Columbus procedure was used in 
1929, 1930, and 1931 in Buffalo, N. Y. by Croxton [17] and in 1931 in 
Syracuse, N. Y. by John N. Webb [18]. 

Among numerous health surveys made in the U. S. from time to 
time, several examples may be mentioned that resemble the preceding 
surveys in their methods of sampling. In 1921, the U. S. Public Health 
Service selected Hagerstown, Md., as a conveniently located and 
“fairly typical small city in the East not greatly influenced by im- 
migration” in which to study sickness and fertility by periodic visits 
to a sample of households [19]. In 1929-30 the Committee on Costs of 
Medical Care made many sample studies, including one on the in- 
cidence of illness among 9000 families [20]. The U. S. Public Health 
Service and Milbank Memorial Fund studied sickness among those 
elements of the population that had borne the brunt of the depression 
by selecting severely affected districts in 10 localities, omitting the 
slums, Negro, and well-to-do sections [21]. 

(d) Public opinion polls: The practice of surveying public opinion 
emerged from simple but obscure beginnings in the “straw vote” 
conducted by newspapers to sense public reactions to candidates and 
obtain human interest stories by interviewing the “man in the street.” 
Even before 1900 the New York Herald collected pre-election reports 
and estimates from all over the United States and attempted to forecast
the outcome of elections. In 1904 it polled 30,000 registered voters 
in New York City. In 1905 the Chicago American and Chicago Journal 
took a poll during the mayoralty campaign and the Columbus Dispatch 
started a long series of polls by taking one in the state election of 1906. 
In 1912 papers in Boston, Chicago, Cincinnati, Denver, Los Angeles 
and St. Louis conducted a nation wide poll in 37 states. Four years 
later, this group of papers repeated its presidential election poll and 
another poll was taken by the Hearst newspapers. The Rexall drug 
store chain took a nation wide poll in 1920 using its stores as centers to 
collect the ballots. These and other efforts were outdone in size by The 
Literary Digest which started polling in 1916 and in 1920 mailed out 
11,000,000 ballots to persons whose names were taken from telephone 
directories and other lists. Two years later it conducted a poll on prohibition.
Other polls were taken at the rate of about one every year and
a half. Robinson found that more than 86 straw polls were taken in the 
presidential campaign of 1928. Although these polls employed quite 
crude methods, they were successful in many instances because a 
moderate error only affected the prediction of the winning candidate 
when the election was close and because among many attempts some 
might be expected to turn out well. The sources of error in straw polls 
were analyzed in detail by Robinson in 1932 [22]. 

Closely related to the polls were market research studies and con- 
sumer surveys conducted by business concerns, publications, and ad- 
vertising agencies. The results and methods of these studies received 
less publicity than the polls but they offered even greater opportunities 
for sampling than the straw votes. There was also a growing interest 
in public opinion research among political scientists, sociologists, and 
others which led to scientific interest in the improvement of the tech- 
nique of opinion polling [23]. 

The foregoing discussion was limited to uses in which the sampling 
methods were quite simple, indeed often crude and inefficient, and not 
based on explicit considerations of probability. The dividing line is not 
sharp between them and some of the other surveys that will be included 
in later sections but they represent fairly well the pioneer stage in the 
emergence of modern sampling practice. This stage might also be 
illustrated by examples of experiments, surveys, and studies in other 
fields in which sampling in some form was used. 

While, in these early instances, the sampling procedures were simple 
and usually employed uncritically with no great attention to accuracy 
and representativeness, it should be noted that the problems of ob- 
serving and recording data were almost always far more serious than 
the problems of sampling. Modern sampling procedures, had they been 
available and had they been applied effectively, would have permitted 
the use of smaller samples and made possible in most cases more care- 
ful fieldwork. Through these developments and those in other fields of 
statistical study, there arose a great opportunity for the use of better 
sampling procedures that has been exploited only in part up to the 
present time. 


IV. DEVELOPMENT OF THE USE OF THE SIMPLER RANDOM AND 
SYSTEMATIC SAMPLING TECHNIQUES 


Much of modern sampling practice rests on processes of selecting 
individuals at random or according to certain systematic procedures. 
These processes may be employed as providing a simple method for 
selecting a sample or they may be used as parts of a more complex 
sampling scheme. 

Random sampling utilizes the devices of games of chance or other 
methods that assure to each unit in the source from which the sample 
is drawn an equal probability before the draw that it will be included 
in the sample. It has the advantage of facilitating certain applications 
of the theory of probability to the analysis and use of the results. 
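
As a minimal sketch of the equal-probability selection just described, the following Python fragment draws a simple random sample without replacement. The helper name `simple_random_sample`, the list of 100 labelled units, and the seed are all assumptions made for illustration; a pseudo-random number generator stands in here for the games of chance or tables of digits mentioned in the text.

```python
import random


def simple_random_sample(units, n, seed=None):
    """Draw n distinct units so that every unit has the same chance
    of inclusion before the draw (sampling without replacement)."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(list(units), n)


# Hypothetical sampling frame: a list of 100 labelled units.
frame = [f"unit-{i}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=1906)
print(sample)
```

With 100 units and a sample of 10, each unit has a probability of 10/100 of being included, which is the property that lets the theory of probability be applied to the results.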

Although the theory of probability was well established in the 
eighteenth century, its applications to the practical drawing of 
samples were delayed until the twentieth. Applications to games of 
chance and lotteries can be excluded from consideration since they were 
not concerned with estimating the characteristics of a population from 
a sample. Likewise, applications to the interpretation of data as having 
been produced by a random sampling process from a hypothetical 
universe can be excluded since they did not involve the actual practice 
of drawing samples. Also trials with dice or cards designed to “verify” 
known probability laws or merely for expository purposes are not 
examples of sampling practice. Doubtless there are many instances that 
have escaped the author’s notice, but the earliest instances he has 
found are Bowley’s sampling of a list of bonds and their interest rates 
in 1906 using the final digits in one of the tables in the Nautical Almanac
to make the random selection and Student’s testing of the t-distribution
in 1907 by drawing cards from a large receptacle. Random 
sampling was facilitated by Tippett’s tables of random numbers 
published in 1927 [24]. The practical importance of random selection 
slowly gained recognition, especially in connection with more complex 
procedures that were developed after 1920 and will be discussed in the 
next section. 

Systematic sampling employs a simple rule of counting cases in some 
convenient order and selecting for the sample every nth case, or of 
using some similar pattern of selection such as taking material at 
measured intervals or drilling samples out of a large mass of metal by 
use of a template that locates the holes. More complex rules or patterns 
may be used and the starting point may be selected by a random choice 
or special rule. The theory of probability can also be applied to the 
results of systematic sampling under certain conditions. Bowley and 
others have interpreted it as a case of stratified random sampling and 
estimated sampling errors accordingly. 
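
The rule of taking every nth case from a random starting point can be sketched as follows. The function name `systematic_sample`, the frame of 200 numbered households, and the interval of 20 are hypothetical choices for illustration only; the sketch assumes the list is already arranged in the convenient order the text refers to.

```python
import random


def systematic_sample(units, interval, seed=None):
    """Select every `interval`-th unit from an ordered list, starting
    at a randomly chosen position within the first interval."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    units = list(units)
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start: 0 .. interval - 1
    return units[start::interval]


# Hypothetical frame: 200 households numbered in listing order.
households = list(range(1, 201))
sample = systematic_sample(households, 20, seed=0)
print(sample)
```

Because the frame length here is an exact multiple of the interval, every possible start yields a sample of the same size (10 households). With other frame lengths the sample size can differ by one depending on the start.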

Systematic sampling has long been in use in taking samples of ore 
from the face of a vein in a mine or samples of metal from a pig or ingot. 
The practice of “cruising” a forest and sampling trees at uniform 
intervals has been well established in forestry for many years [25]. 

Systematic selection was utilized by A. N. Kiaer in his survey of 
Norwegian workers in 1895. As a means of facilitating special tabula- 
tions from census schedules, it was used in a study of family data in the 
1900 Norwegian census, a study of marriage in the Danish census of 
1901, and a housing study for Oslo in 1913-14 [26]. 

Kiaer’s work apparently had little influence on surveys in the U. S.
or in Europe. Apart from the discussions he initiated in the Bulletin 
of the International Statistical Institute from 1896 to 1903, it was 
mentioned only briefly by Edgeworth in his Presidential address before 
the Royal Statistical Society in 1912 and in the U. S. by W. B. Bailey 
and F. S. Chapin in 1906 and 1920 respectively [27]. It had a greater
effect in certain continental surveys [28]. 

In 1909 Bowley included a brief chapter on Sampling in his Elementary
Manual of Statistics making general references to sampling in
commerce, mining, and industry, but not to specific sample surveys. 
However, in 1912 he took every twentieth working class household in 
Reading in a study of poverty and his associates did the same for three 
other cities in 1913 [29]. He computed probable errors of sampling and 
recognized other sources of error in the interviewing, definitions, and 
process of estimating. This study was followed by a systematic sampling 
of census schedules in 1915, and by similar surveys of Liverpool (1930) 
and Merseyside (1931) by Caradog Jones, Southampton (1927f) by
Ford, and London (1929) by the London School of Economics under 
Bowley’s direction [30]. 

John Hilton made a series of studies of workers in the unemployment
insurance system beginning in 1923 [31]. They were selected systemati- 
cally from the files of the Labour Exchanges. In addition to the in- 
formation in the records some data were obtained by interviewing 
workers who came to the exchange. Certain deviations from strictly 
systematic selection introduced biases which were reduced by sub- 
sequent improvements of the method. Hilton found a sample of only 
one per cent quite satisfactory to meet the practical administrative 
and policy-making purposes for which the studies were made. The re- 
duction in expense that resulted from sampling such a small proportion 
of the records was indeed impressive. Oddly enough the method was 
not imitated to any great extent by other government bureaus. 

Another example of the great usefulness of sampling to get quick 
inexpensive tabulations was furnished by the Japanese who took a 
sample of one in 1000 from their 1920 census to get data quickly after 
the earthquake of 1923 [32]. Subsequent checking when the usual 
tabulations were completed verified the accuracy of the sample results.
Further examples of sampling census schedules for special tabulations 
are Day Monroe’s study of Chicago families in the 1920 Census and 
R. F. George’s study of British workers in the 1931 Census [33]. 

In spite of this successful experience, the use of random and sys- 
tematic sampling procedures in statistical work made slow progress. 
At the Rome Session of the International Statistical Institute in 1925 
a resolution was adopted recommending the use of sampling for 
statistical purposes with appropriate precautions as to its representa- 
tiveness, mathematical statement of the precision, and full description 
of the methods employed. The reports that were submitted by the 
commission that drafted the resolution presented evidence of the use- 
fulness of sound sampling procedures but both the reports and the 
resolution had less immediate effect on statistical practices than might 
have been expected. Eight years later, in 1934, Neyman revived the 
discussion of these reports in a notable paper before the Royal
Statistical Society.

In America, serious attention was given to problems of sampling 
methodology by several committees of the Social Science Research 
Council. There was a strong current of interest among the sociologists 
including Chaddock, Ross, Ogburn, Stouffer, Lundberg, Dorothy 
Thomas, Stephan and others [34]. Margaret H. Hogg, who had worked 
under Bowley’s direction on some of the British surveys, came to 
America to the staff of the Russell Sage Foundation and there made a 
critical study of employment and unemployment statistics. In an 
article in the Journal of the American Statistical Association she made 
a strong plea for rigorous methods of sampling and cast doubt on the 
value of surveys such as those that had been made in Philadelphia and 
Buffalo, in which the sample was selected by judgment rather than 
random procedures. She was equally concerned about the adequacy of 
the classification of workers into the various categories of employment 
and unemployment and the analysis of the data in terms of significant 
questions. In the spring of 1931 Miss Hogg made a survey of unem- 
ployment in New Haven, Connecticut, partly for the purpose of testing 
the practical difficulties of applying a random sampling method and 
also developing better schedules and statistical categories for unem- 
ployment surveys [35]. 

Two very important general developments affecting the use of 
sampling in the United States occurred in 1933. One was organization 
of large-scale work projects for the unemployed under the national 
programs of the Federal Emergency Relief Administration (1933-35),
Civil Works Administration (1934-35), and Work Projects Administra- 
tion (1935-40). The second was the enlistment of many leading 
statisticians from the universities and business in the reorganization of 
government statistical work and administration of emergency agencies 
[36]. The statistical needs of the government increased tremendously as 
it took active steps to meet the problems of the depression and under- 
took various New Deal programs. Since the American statistical system 
is decentralized, it became necessary to establish a Central Statistical 
Board to coordinate and regulate the statistical activities of many 
agencies engaged in collecting data for their own purposes or for gen- 
eral use. The Board was largely an outgrowth of the Committee on 
Government Statistics and Information Services, an advisory group 
organized by the Social Science Research Council and the American 
Statistical Association at the request of the Secretaries of Agriculture, 
Commerce, Interior and Labor to assist them in overhauling the statis- 
tical work of their Departments. 

The great variety of statistical work done by the regular agencies, 
the National Recovery Administration, and other emergency agencies 
far exceeds the scope of this paper. Only a few large scale surveys can 
be described briefly as examples of those of their projects that were 
based on sampling; many smaller and more specialized studies were 
also undertaken. A number of difficulties prevented these projects from 
embodying fully the best sampling procedures then known and from 
exhibiting the range of uses to which sampling methods could be put. 
Most of the work was started with inadequate time for planning and 
preliminary trials. There were, curiously enough in a period of unem- 
ployment, severe shortages of persons technically competent to super- 
vise the work. Moreover the field workers and office staffs were drawn 
largely from the unemployment relief lists and worked under restric- 
tions as to hours and rates of pay. In spite of these difficulties the re- 
sults contributed greatly to many very important programs and pro- 
vided detailed statistical information where there had been little or 
none before. 

(1) The Financial Survey of Urban Housing was taken in conjunc- 
tion with the Real Property Inventory, a complete canvass of housing 
in May 1934 covering 60 cities. The Financial Survey was taken by 
interviewers who visited all the families in one block out of every ten 
in larger cities and one in seven in smaller cities. Additional schedules 
were obtained by mail from families in another four blocks out of ten. 
The survey resulted in a 1200 page report of detailed financial data 
and was unique as an early instance of the use of sampling procedure 
for obtaining special information as part of a single complete canvass
[37]. 

(2) A series of research studies of workers on relief and related sub- 
jects were undertaken by the FERA and WPA in selected cities and 
counties from 1933 to 1940 using a variety of methods and samples. The 
localities were selected in part by judgment of their representative- 
ness of different types of situations that were to be compared, in part 
by administrative considerations of availability of supervisors, non- 
interference with other projects, etc. Systematic sampling was used ex- 
tensively within the localities for many of these studies and the samples 
were appraised critically in the reports. One of the studies was an 11 
per cent national sampling of workers on relief in March 1935, with 
proportions varying from state to state, providing very detailed tabu- 
lations by age, occupation, and education. The accuracy of the sample 
results was verified by comparison with subsequent complete tabula- 
tions and a table of sampling errors was given [38]. 

(3) The Study of Consumer Purchases was conducted in 1935-36 by 
the Bureau of Labor Statistics in 32 cities and the Bureau of Home 
Economics in 19 small cities and 206 villages or rural counties [39]. 
The study was a work relief project and other agencies participated in 
the planning and analysis. In selecting the localities, available data on 
their economic characteristics were used to achieve a high degree of 
representativeness; within each locality families were selected sys- 
tematically from city directories, the schedules of the Real Property 
Inventory (New York City), the 1934 WPA Census of Chicago, and 
similar lists. The smaller areas were canvassed completely. From the 
first simple schedules obtained for 783,000 families a second sample 
was selected for interviewing about details of income and expenditures. 
This sample was designed to provide adequate numbers of families 
in certain relatively rare categories without taking more families in the 
more prevalent categories than were needed for comparisons between 
the various types and kinds. The sampling procedure was complicated 
by the necessity of interviewing in such a manner that if the work were 
terminated at any stage in the process families that had been visited 
would still be a representative sample. This project far exceeded any 
previous study of its kind in size and complexity. The procedures 
and problems have been described rather fully in the reports and else- 
where. They follow a basic plan that was prepared by the Social Science 
Research Council in 1929 [40]. 

(4) The National Health Survey was conducted by the U. S. Public 
Health Service in 1935-36. It covered 83 cities in 18 states with sampling 
ratios varying from 1 in 38 for New York City to complete canvass of
the smaller places. The sampling was done by dividing each city into 
small areas of less than 1000 persons each using the 1930 Census 
enumerators’ maps as the basis, and then drawing a systematic sample 
of these small districts for complete canvass. The Survey supplied a 
great amount of data on illness and medical care. Its schedules were also 
tabulated in great detail by the Social Security Board to obtain cross 
classified information about family composition and characteristics for 
actuarial studies. 

(5) The Michigan Census of Population and Unemployment was 
taken early in 1935 by the State Emergency Welfare Relief Commission 
as a WPA work relief project [41]. The sampling procedure involved 
complete enumeration of all cities between 3000 and 40,000 inhabitants
and a 20 per cent random sample of smaller places and rural
townships. Cities of more than 40,000 inhabitants were sampled by 
taking all dwellings with certain predesignated house numbers accord- 
ing to a scheme developed by S. A. Stouffer. The reports not only pro- 
vided detailed data on unemployment in various parts of the state, but 
also unusual information on migration that was subsequently analyzed 
by John N. Webb and Albert Westefeld [42]. 

(6) The Minnesota Income Study was a work-relief project spon- 
sored by the Minnesota Resources Commission and operated in 1938 
and 1939 using a complete system of stratified sampling with a sys- 
tematic selection of households in urban areas and a random selection 
of two-square-mile areas in rural districts [43]. 

(7) The New York City Youth Survey was made by the Welfare 
Council of New York City in 1935 taking every hundredth household 
from the lists of the 1934 Real Property Inventory supplemented by a 
similar sample of dwellings built after the Inventory [44]. It was also 
a WPA work relief project. All members of these households between 
16 and 24 years of age were interviewed about their leisure activities, 
employment, schooling, and other personal data. 

(8) The Continuous Work History Sample is a 4 per cent sample of 
workers covered by old age insurance maintained by the Social Se- 
curity Board in connection with its records of earnings and contribu- 
tions to the insurance scheme [45]. Beginning with 1937 all reported 
earnings and related information are compiled for this group of 
workers. They were selected by use of certain combinations of digits 
in their account numbers. The sample provides annually distributions 
by age, sex, race and taxable wages that are needed for actuarial pur- 
poses as well as useful for general economic studies. 

(9) The 1937 Enumerative Check Census of Unemployment was
conducted by the Director of the National Unemployment Census 
who had been ordered by an act of Congress to undertake a voluntary 
census of the unemployed by mail [46]. The Check Census was designed 
to determine what proportion of the unemployed had actually regis- 
tered and thereby provide more accurate estimates of the volume of 
unemployment than had been available previously. The sample was a 
systematic selection of all households in 2 per cent of the postal carrier
routes in the U.S.A. About 18 per cent of the population, not served by 
carrier delivery, was excluded but was taken into account in the es- 
timates. The 510,000 households in the sample were visited by postal 
carriers to obtain data of employment and unemployment and a special 
count was made of the number of workers and unemployed persons 
for the routes that constituted the sample. From these data it was 
determined that 71 per cent of the unemployed had registered and es- 
timates were prepared for separate classes of workers with correspond- 
ing calculations of the sampling errors of the estimates. The analysis 
of the estimates and their errors was especially noteworthy. 
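
The reasoning behind the Check Census in item (9) can be sketched in Python: a voluntary count is scaled up by the proportion of the unemployed that a check sample shows to have registered. Only the figure of 71 per cent comes from the text above. The function name `estimate_total_unemployed` and all of the counts are hypothetical, and the binomial standard error shown is a simplifying assumption, not the calculation used in the Check Census reports.

```python
import math

def estimate_total_unemployed(registered_total, sample_unemployed, sample_registered):
    """Scale a voluntary registration count by the registration rate found in a check sample."""
    rate = sample_registered / sample_unemployed   # share of sampled unemployed who registered
    estimate = registered_total / rate             # implied total number of unemployed
    # Binomial standard error of the rate. This treats persons as independent
    # draws, so it understates the error of a sample taken by whole carrier routes.
    se_rate = math.sqrt(rate * (1 - rate) / sample_unemployed)
    return rate, estimate, se_rate

# Hypothetical counts chosen so that the rate equals the 71 per cent cited above.
rate, estimate, se_rate = estimate_total_unemployed(
    registered_total=710_000,
    sample_unemployed=10_000,
    sample_registered=7_100,
)
```

Because the sample consisted of entire routes rather than independently chosen persons, a proper error calculation would work from the variation between routes.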

The foregoing examples exhibit progress in utilizing sampling tech- 
nique for statistical surveys of exceedingly important national prob- 
lems. They reflect the trend toward greater use of detailed statistical 
data and research as a basis for determining policies and establishing 
programs on a city and state wide basis as well as nationally. They 
are impressive for the size of the samples taken, but there were also 
many smaller samples of comparable importance. In two ways, they 
were a result of the depression: (a) in the urgent demand for statistical 
data generated by the emergency and by the New Deal programs, and 
(b) in the funds and relief personnel that were made available. How- 
ever, these influences tended toward haste and inadequate prepara- 
tion, toward the use of less efficient sampling schemes in order to meet 
problems of supervision and the restrictions on expenditures for travel 
by interviewers, and toward types of inquiries that were suitable for 
the kinds of interviewers that were available. Problems of cost and 
economical operation were considered but were affected by the primary 
purpose of providing useful work for unemployed white collar workers. 
This tended to discourage intensive analysis or research studies not of 
a relatively routine character since the amount of employment provided 
in relation to the kind and amount of supervision required was smaller 
for such projects. Finally sampling problems received only secondary 
emphasis because the other problems connected with the surveys were 
much more serious and time consuming. The results of these surveys 
were of very great value and to a large extent this was due to the use
of sampling procedures. 

Coupled with the direct effect of depression problems in stimulating 
the use of sampling was the trend of thinking among statisticians and 
social scientists who, during the late 20’s and early 30’s, became in- 
creasingly interested in applying error theory to the analysis of time 
series and other data [47]. This trend of interest was well represented 
among many of the men who participated in the work of the Commit- 
tee on Government Statistics, and recommended in 1933-35 greater 
and better use of sampling procedures in government statistical work 
for such purposes as (1) checking the coverage, accuracy, and complete- 
ness of censuses, (2) testing the representativeness of current reporting 
systems (crops, cost of living, payrolls, employment, production, etc.), 
(3) supplementing the regular schedules in censuses to relieve them of 
overloading, (4) obtaining prompt national and regional figures on 
such subjects as unemployment, and (5) making very detailed intensive 
studies in which the emphasis is on the analysis of relationships [48]. 
The Committee also recommended that the staff of the Central Statis- 
tical Board and the Bureaus of Agricultural Economics, Census, and 
Labor Statistics include expert advisers on sampling procedure whose 
formal professional training in the technical phases of sampling was 
comparable to that of chemists, physicists or biologists in the govern- 
ment service. These agencies did move to accomplish what had been 
recommended. They experimented with trial surveys of unemploy- 
ment, agriculture, construction, retail prices, and basic studies of the 
problems involved in developing practical sampling systems [49]. 

By this time sampling practice had outgrown the simpler methods of 
random and systematic sampling and was developing complex sampling 
systems specially designed to fit the nature of the population that was 
being sampled, the costs and administrative factors, and the principles 
of efficient design that had evolved in scientific experimentation and 
industrial production. 


V. EXPERIMENTAL COMPARISON OF METHODS AND DEVELOPMENT OF 
COMPLEX SAMPLING PROCEDURES 


Modern sampling practice utilizes a variety of devices and methods 
which have been tested experimentally and incorporated into sampling 
theory. The aim of modern practice is to select those methods which, 
when combined in the most appropriate manner, will constitute a 
sampling system that is economical, convenient, and accurate in the 
situation for which it was designed. Among the devices available for 
the construction of such a sampling system are: (a) the technique of
selecting items at random, (b) the determination of the kind of sampling 
unit that will have the most desirable properties, (c) subdivision or 
stratification of the population, in an advantageous manner, (d) the 
use of variable sampling proportions, (e) subsampling and multistage 
sampling in which the sample is itself sampled in turn, (f) drawing two 
or more units from each ultimate subdivision to permit estimation of 
the sampling error from their differences, (g) use of information pro- 
vided by variables that are correlated with the one that is being stud- 
ied, and many other procedures. 
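
Devices (c) and (f) may be combined in a simple sketch: the population is divided into strata, two units are drawn at random from each stratum, and the difference between the two supplies an estimate of the sampling error. The Python below is an illustration under stated assumptions and not a procedure taken from any survey described here. The function name `stratified_pair_estimate` and the example figures are hypothetical, and each stratum is assumed to contain at least two units.

```python
import random

def stratified_pair_estimate(strata, rng):
    """Estimate a population total and its variance from two random units per stratum."""
    total = 0.0
    variance = 0.0
    for values in strata:
        n_units = len(values)
        y1, y2 = rng.sample(values, 2)      # two units drawn without replacement
        total += n_units * (y1 + y2) / 2    # stratum size times the sample mean
        s2 = (y1 - y2) ** 2 / 2             # sample variance of the two values
        fpc = 1 - 2 / n_units               # finite population correction
        variance += n_units ** 2 * fpc * s2 / 2
    return total, variance

# Hypothetical strata, each listing the value of every unit it contains.
strata = [[10, 12, 11, 13], [50, 55, 52, 58, 54], [100, 110, 105]]
total, variance = stratified_pair_estimate(strata, random.Random(0))
```

When the units within every stratum are alike, the two drawn units agree and the estimated variance is zero, which is the sense in which advantageous stratification reduces sampling error.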

It is difficult to find the original source of many of these techniques 
but it is clear that they were, to a large degree, crystallizations of no- 
tions suggested by common sense, customary practices, and practical 
problems encountered in sampling. Discovery of their importance and 
success in combining them into efficient sampling schemes has required 
a clear understanding of their properties and a considerable amount of 
ingenuity. Beyond that, careful experimentation has been necessary 
to test their relative merits under various conditions. Their develop- 
ment has occurred primarily in connection with agricultural experi- 
mentation, mass production industry, and large-scale surveys. 

(a) Agricultural experimentation: One of the first steps in the ex- 
perimental comparison of sampling schemes was the study of hetero- 
geneity in the fertility of a field and the consequent correlation of 
adjacent portions. In 1910 and 1911, very important work was re- 
ported by “Student,” Wood and Stratton, and Mercer and Hall 
[50], leading to the conclusions that in field trials there is always some 
experimental error, that it can be reduced by taking a large number of 
small plots and, that, in the comparison of two varieties or treatments, 
it can be reduced by taking these plots in pairs, adjacent to each other, 
and using one plot out of each pair for each variety or treatment. The 
latter device was adopted so as to take advantage of the correlation 
between adjacent areas. Plots of different sizes and shapes were com- 
pared using the results of a careful harvesting and measurement of man- 
golds and data on 400 reports of duplicate plots reported by other in- 
vestigators. These results led to similar experiments in England, Amer- 
ica, and elsewhere [51]. Various systematic sampling schemes were 
proposed and tried experimentally, both for locating plots and sub- 
sampling the plots in order to reduce the labor of harvesting them 
completely. In 1929, Clapham and Wishart published the results of 
experimental comparisons between certain random and systematic 
methods of sampling a field of potatoes or cereals and analyzed the 
factors that contributed to the error of the estimate of yield [52]. In
the same year Smith and Prentice published results of a field study in 
which they took samples of soil and then subsampled for laboratory 
examination. In their analysis of the results they estimated the increase 
in error due to subsampling [53]. R. A. Fisher, whose advice and 
methods of statistical analysis had been followed in many of the experi- 
ments in the 20’s and later, developed complex systems of experimen- 
tation of great efficiency and revolutionized experimentation [54]. 
Yates and Zacopanay published in 1935 a thoroughgoing analysis of the 
efficiency of sampling a field, including comparative labor costs [55]. 
A number of manuals were published expounding the principles and 
practice of field experimentation. This work of the agronomists and 
statisticians had direct applications to large-scale surveys. 
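
The advantage of taking plots in adjacent pairs, noted earlier in this paragraph, follows from the correlation between neighboring areas. A minimal sketch in Python, assuming that both plots of a pair have the same variance `sigma2` and correlation `rho` (symbols and figures introduced here for illustration, not taken from the studies cited):

```python
def variance_of_mean_difference(sigma2, rho, pairs):
    """Variance of the mean yield difference between two treatments over `pairs` plot pairs.

    Each difference has variance sigma2 + sigma2 - 2 * rho * sigma2, so a
    positive correlation between the plots of a pair reduces the error.
    """
    return 2 * sigma2 * (1 - rho) / pairs

# Hypothetical figures: ten pairs of plots with plot variance 4.0.
unpaired = variance_of_mean_difference(sigma2=4.0, rho=0.0, pairs=10)
paired = variance_of_mean_difference(sigma2=4.0, rho=0.6, pairs=10)
```

Under these assumed figures the paired arrangement has well under half the variance of the unpaired one; the size of the gain in any actual field depends on how strongly adjacent plots are correlated.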

(b) Mass production industry: Another source of complex sampling 
procedures was the development of inspection and quality control in 
mass production. In 1923, engineers at the Western Electric Com- 
pany, manufacturers of telephone equipment, began to apply prob- 
ability theory to the inspection of telephone exchange equipment. In 
the refinement of manufacturing processes and progressive adoption of 
more rigorous specifications, the cost of inspection had become very 
burdensome and sampling procedures offered possibilities of reducing 
the amount of inspection without serious loss of control over the 
quality of the product. Different sampling schemes were compared on 
the basis of their probability-of-acceptance curves and tolerances were 
established for the proportion of defective pieces in a lot. Thereafter 
sampling procedures were developed in more complex forms to achieve 
a reduction in the amount of inspection to the minimum necessary 
under the conditions of manufacture and the specifications to be met 
[56]. Technical problems more or less peculiar to the telephone industry
made it the pioneer in the applications of mathematical statistics in
industrial processes, including not only inspection but quality control
and probability analysis of operating problems under Rorty (as early
as 1903), Shewhart, Dodge, Molina, Fry, and others [57].
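
A probability-of-acceptance curve of the kind mentioned above can be sketched for the simplest inspection scheme, in which a lot is accepted when a sample of n pieces contains no more than c defectives. The Python below is an illustration only. The binomial form assumes a lot large in relation to the sample, and the function name `probability_of_acceptance` and the two plans compared are hypothetical, not taken from Western Electric practice.

```python
from math import comb

def probability_of_acceptance(n, c, p):
    """Chance that a sample of n pieces shows at most c defectives
    when the fraction defective in the lot is p (binomial model)."""
    return sum(comb(n, d) * p**d * (1 - p)**(n - d) for d in range(c + 1))

# Two hypothetical plans evaluated at three lot qualities.
qualities = (0.01, 0.05, 0.10)
plan_a = [probability_of_acceptance(50, 1, p) for p in qualities]
plan_b = [probability_of_acceptance(100, 2, p) for p in qualities]
```

Plotting such values against the fraction defective gives the curve for each plan, and the plans may then be compared by the protection they give at the tolerance chosen for a lot.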

(c) Communication of new techniques: These developments in agriculture
and engineering had both direct and indirect effects on sampling
survey practice. They provided principles of design and contributed to
the growth of applied mathematical statistics. Still there were many
practical problems and obstacles that delayed the immediate extension
of the methods developed for field trials and manufacturing to large-scale
surveys. One of the obstacles was the relative lack of communication
between statisticians engaged in different types of work. An important
step toward facilitating communication was taken by the Royal
Statistical Society in 1933 when it formed the Industrial and Agricul- 
tural Research Section and published the proceedings of meetings of 
the Section as a Supplement to the Society’s Journal. The Section 
brought together mathematical statisticians, engineers, agriculturists
and others interested in practical applications of newly developing
statistical theory and greatly stimulated interest in sampling. E. S. Pearson
presented a notable paper on “Problems of Industrial Sampling” at a 
meeting of the Section early in 1934 [58]. A few months later, Jerzy 
Neyman read before the Society a comprehensive theoretical and critical
paper on stratified sampling and purposive selection. He cited
among several examples a survey of the working population in Poland 
for which he had designed a stratified random sample of districts [59]. 
A number of other papers by various authors, published from time to 
time in the Supplement, included discussions of sampling practice. 

In this period interest in sampling practice was heightened in Great 
Britain by Shewhart’s visit in 1932 (a factor in the formation of the 
Industrial and Agricultural Research Section) and in America by the 
visits of Fisher, Neyman, Yates, and Cochran (1936-38) as well as by 
American students who studied in England. 

(d) Large-scale surveys: Another root of complex sampling practice 
is to be found in the progressive improvement of large-scale surveys of 
crop yields and acreages, the labor force, and other economic and
social facts. A pioneering survey in India in 1923 by J. A. Hubback 
went almost unnoticed until Mahalanobis discovered and republished 
it in 1946 [60]. Hubback undertook to estimate the yield of rice by ob- 
jective measurements in the fields instead of by the methods based on 
the judgment of farmers and brokers. He used as sampling units very 
small areas measured off by a wooden frame in every field his agents 
found being harvested as they walked a fixed route from each selected 
center. Hubback was acquainted with “Student’s” and Bowley’s sam- 
pling work. His report emphasized the advantage of random selection 
of many very small areas as economical and necessary to obtain a valid 
estimate of error. 

In the period 1928-31, J. O. Irwin examined the British crop es- 
timating system and made recommendations for improving the 
sampling methods [61]. Subsequent work was done by Cochran. In the 
United States, W. F. Callander, J. B. Shepard, C. F. Sarle and others 
began to plan in the late 20’s for a sample of farms to be used for various 
studies of acreage, production, livestock, etc. Sarle made comprehen- 
sive appraisals of the methods used to obtain farm-price data in 1927 
and crop-yield estimates in 1932 and proposed a program of studies
aimed at testing and improving the various sampling operations then 
in use [62]. Several studies of this type were made by state agricultural 
statisticians. 

In July, 1936, the Bureau of Agricultural Economics and Iowa State 
College held a conference on sampling attended by many of the state 
agricultural statisticians as well as staff from Washington and Ames 
[63]. R. A. Fisher, who was lecturing at Ames, participated in the discussions.
The conference set forth a great variety of practical problems
and outlined specific projects for research to help solve them. Some of
these projects were started at Ames in cooperation with the Bureau,
some were undertaken in Washington. The program has been con- 
tinued up to the present time with additions and modifications. It has 
resulted in a remarkable series of reports and developments of meth- 
odology [64] as well as the establishment of a sampling staff within the 
Iowa State College Statistical Laboratory at Ames under G. W. 
Snedecor, A. J. King, and R. J. Jessen.

During the same period, P. C. Mahalanobis and his associates at the 
Indian Statistical Institute in Calcutta developed large-scale sampling 
methods to a high degree of technical perfection and utilized them for 
surveys of jute acreage, agricultural finance, famine conditions, radio 
listening, and many other subjects. The theoretical system associated 
with these practical applications has been set forth fully by Mahala- 
nobis in his publications and lectures [65]. Mahalanobis has made careful 
experimental measurements of the human errors in sampling surveys 
by use of “interpenetrating samples” and other means. 

From 1936 on the Research Division of the U. S. Bureau of the 
Census under C. L. Dedrick made extensive studies of problems in 
population sampling, utilizing the 1930 Census returns. The results 
have not been published. In 1939, Stephan, Deming, and Hansen de- 
signed a sampling scheme that was used in the 1940 Population Census 
to draw a 5 per cent sample totalling 7 million people. This method was 
based on experimental tabulations of 1930 census data and on a varia- 
tion of systematic sampling [66]. The sample was used to obtain pre- 
liminary estimates of regular census statistics as well as tabulations of 
the special questions that were asked only of the people in the sample. 
Subsequent complete tabulations verified the accuracy of the estimates. 

During the same period experiments were conducted in a number of 
cities and towns by the Research Division of the Work Projects Ad- 
ministration, under H. B. Myers and J. N. Webb, to test out methods of 
conducting sample surveys of unemployment. On the basis of these 
trials a monthly nation-wide survey was started in December 1939
utilizing a subsampling procedure in approximately 60 counties. Its 
scope was enlarged during the war to include housing and labor ques- 
tions and in 1942 it was transferred to the Census Bureau where it was 
developed further as the Monthly Survey of the Labor Force [67]. 

With the Monthly Survey and other sampling surveys required by 
the war agencies, the Census Bureau had a sizeable program of sampling 
work and developed, under Morris H. Hansen, a staff of sampling spe- 
cialists who designed sample schemes, and studied their performance in 
practice. Data from the 1940 Census for very small areas were made 
available for use in sampling and Sanborn maps showing every street 
and building in a large part of the cities in the United States were ob- 
tained [68]. In addition, airplane maps were obtained for a large 
part of the United States and procedures were developed for employing 
them in sampling. 

A further step was taken in the development of sampling facilities 
when the Bureau of Agricultural Economics established at Iowa State 
College in 1943 a project to design and prepare a Master Sample of 
farms for use in its surveys [69]. This Master Sample is a stratified se- 
lection of very small areas based on airplane photographs and highway 
maps. The Census Bureau joined in its development and had it ex- 
tended to include other than farm areas. 

These most recent developments in complex sampling procedures 
have been described very inadequately and many other recent develop- 
ments have been omitted for want of space. Other speakers on the pro- 
gram of the International Statistical Conferences will present the cur- 
rent stage of theory and examples of present day practice. 


VI. DEVELOPMENT OF USE OF JUDGMENT SELECTION AND 
CONTROLS IN SAMPLING 


All attempts to obtain data by sampling involve some choices based 
on the purposes of the study, knowledge of the material, and experi- 
ence with the methods of observing and recording similar data in previ- 
ous studies. For example Bowley’s choice of Reading for his survey in 
1912 and Clapham’s choice of certain fields at Rothamsted for his ex- 
periments on sampling design were not random selections. Yet the con- 
clusions drawn from such samples as these are commonly generalized 
to apply to other periods of time and other places. Some investigators 
have depended entirely on their knowledge and judgment to make the 
final selection of the items that constitute the sample; others have 
insisted that judgment should only be used to subdivide the population 
into strata and units of sampling, the actual choice of the sample being 
made at random. These views have been presented in various forms 
and the differences have not yet been fully resolved. Research is in 
progress to determine some of the facts and develop the principles 
that are involved in appraising the performance of various sampling 
procedures based on these conflicting views. 

Jensen, in his report in 1926, described a number of surveys made by 
“purposive” sampling in which districts were selected which, taken to- 
gether, matched the population in certain characteristics known from a 
previous census. The theory of purposive selection was developed by 
Bowley in 1926, and discussed by Jensen in 1928 [70]. Neyman com- 
pared purposive and stratified random sampling in 1934 giving special 
attention to experience reported by Gini and Galvani. Mangus de- 
scribed a variety of purposive sampling of counties used in studies of 
the FERA in 1934 [71]. 

The development of public opinion polling and market research in- 
volved many interesting problems of social psychology and survey 
procedure that fall outside the scope of this paper, but it also led to the 
use of many different sampling schemes, prominent among which were 
those termed “quota control” sampling. The earliest work was done 
with samples chosen to meet the convenience and very limited resources 
of the research personnel. In this respect they were like most other sur- 
veys conducted at the time. Robinson has reviewed in detail the history 
of straw votes and opinion polls to 1932 [72]. He also analyzed The 
Literary Digest poll which, in spite of its great size, was badly biased by its 
method of mail balloting. Several analysts were able to improve The 
Literary Digest results by relatively simple adjustments, [73] but, in 
field surveys in which interviewers are left free to use their judgment in 
selecting a representative group of respondents, a better procedure 
appeared to be to regulate the selection of the sample by assigning to 
each interviewer a quota of persons of various kinds so as to obtain the 
correct proportions of men and women, old and young, rich and poor, 
etc. in each city or district that was canvassed. This procedure was 
developed by Cherington, Roper, Gallup, and Crossley, whose sampling 
surveys of opinion became widely known after the election of 1936, and by 
market research and opinion survey organizations, whose studies re- 
ceived less publicity [74]. The general principles of these and related 
methods have been discussed by Brown, Franzen and Robinson, by 
Cassady, and in two symposia [75]. Quota methods have been em- 
ployed in the surveys of the National Opinion Research Center [76], 
the Office of War Information and the British Wartime Social Survey 
[77]. Most of the opinion polls and market research studies around the 
world at present employ quota methods or related procedures of sam- 
pling. These methods have been adopted partly by imitation of the 
more widely known surveys and partly because they are attractive with 
respect to costs and practical convenience. 


VII. GENERAL VIEW OF THE PRESENT PERIOD 


Current uses of sampling will be described by other speakers at this 
session. For the most part these uses were connected with the war 
which proved to be an even greater accelerating stimulus than the de- 
pression of the 30’s. They included surveys of employment, housing, 
prices, shortages of consumer goods, dealer inventories, public reaction 
to wartime measures, characteristics of selectees, the attitudes of sol- 
diers, effects of bombing in England, Germany and Japan, public opin- 
ion in many countries, inspection of election registers in Greece, surveys 
of saving and spending, employee attitudes, reading and radio listening, 
social and psychological factors affecting fertility in the U. S. and in 
Britain, etc. In manufacturing and procurement, sampling methods 
were perfected for quality control and inspection of materials, parts, 
and finished products. Market research and opinion studies by private 
research agencies and business concerns developed rapidly, with con- 
siderable emphasis on measuring public relations and employee atti- 
tudes. The volume of work increased and many continuing or repetitive 
surveys were established, making it possible to spread the costs and 
hence devote greater resources to the design of each sample than had 
been possible for infrequent sampling surveys. 

Other speakers will describe the more recent developments that 
grew out of these earlier accomplishments. The Journal of the Royal 
Statistical Society and its Supplement, the Journal of the American 
Statistical Association, the Annals of Mathematical Statistics and many 
of the scientific and professional publications in various fields have pub- 
lished the increasingly numerous articles that reflect the progress of 
applied sampling. The reader should review them, beginning in 1940 or 
earlier, and search articles that report particular studies as well as those 
that are principally devoted to discussions of method. 

Great as these advances have been, they are only the forerunners of 
a broad expansion of sampling in the future. Further improvements in 
method are also to be expected. Much experimental work is being done 
within and outside government bureaus. The Committee on the Meas- 
urement of Opinion, Attitudes and Consumer Wants, appointed jointly 
by the National Research Council and the Social Science Research 
Council, is making a comprehensive study of sampling methods as 
they are used in its field. The United Nations Statistical Commission 
has established a sub-commission on sampling which will contribute 
greatly to the development of sampling surveys in many countries. 
The International Statistical Institute can perform a great function in 
extending the use of sampling procedures in the many statistical 
activities throughout which it exerts its leadership and influence. 

THE USE OF MARK SENSING IN A LARGE SCALE 
TESTING PROGRAM* 
WALTER L. DEEMER, JR. 
AAF School of Aviation Medicine, Randolph Field, Texas 


A description is given of the use of mark sensing in the 
Psychological Program of the Army Air Forces. The prin- 
ciples of mark sensing are summarized and the mark sensing 
cards of the Psychological Program are exhibited. Some gen- 
eral principles are given for use in deciding on when to use 
mark sensing in a testing program. 


It is the purpose of this paper to describe the experience of the 
Psychological Program of the Army Air Forces in the use of mark 
sensing, so that this experience may be utilized by others who may 
have problems similar to those we encountered during the war. 

There are five sections to the paper: 1) a brief history of the testing 
program in which mark sensing was used; 2) a statement of the prob- 
lems which we attempted to solve by mark sensing; 3) a description of 
mark sensing; 4) the plan that we devised; and 5) an evaluation of the 
plan with suggestions for future uses of mark sensing in large scale 
testing programs. 

During the early stages of the war, the testing of aircrew candidates 
for classification as Bombardier, Navigator, or Pilot was accomplished 
at three centers, one located at Nashville, Tennessee, one at San An- 
tonio, Texas, and one at Santa Ana, California. A central records 
agency was established at Ft. Worth, Texas, at Headquarters, AAF 
Training Command. At the testing centers there was administered to 
each candidate a battery consisting of twelve to sixteen paper and pen- 
cil tests, and five or six psychomotor tests. For a list of the tests admin- 
istered see Appendix A. The paper and pencil tests were administered to 
groups of 100 to 300 candidates; the psychomotor tests were adminis- 
tered to groups of four candidates, typically on the day following the 
group tests. Each psychomotor test required fifteen minutes to admin- 
ister. For purposes of administrative control, and for research it was 
desired to have recorded on punched cards in Ft. Worth all the scores of 
every candidate tested and also certain supplementary information, 
some of it for identification, such as name and army serial number; 
some of it biographical, such as age, marital status, education and 
previous flying training; some of it for classification purposes, such as 
preference for types of training, status, physical qualifications, 
recommended classification, and final classification. 

* This paper was originally presented at the Educational Research Forum, Endicott, N. Y., 
August 28, 1947. It represents the opinion of the author only and is not to be construed as representing 
the official opinion of the War Department or of the Army Air Forces. 

In brief, then, the problem was to devise the most efficient and ac- 
curate means of getting all the above data on punched cards in Ft. 
Worth. The obvious method, of course, was to have the necessary 
rosters prepared at the testing centers and to punch cards from these 
rosters. On the basis of descriptions of mark sensing, however, it was 
decided that this method might be more efficient than rosters. Ef- 
ficiency was important because of the large testing load. An average of 
over 14,000 men per month was tested during the last half of 1942. 

Before describing the plan that was devised, it might be well here to 
summarize briefly how mark sensing works. Figure 1 shows Group Test 
Mark Sensing Card No. 1. The particular use of this card in the testing 
program will be described later. It will be described now as an example 
of mark sensing in general. 

The mark sensing portion of this card begins at column 16. Each of 
the small ovals with numbers in the center is a mark sensing position 
and covers three ordinary punching columns. There are 22 mark sensing 
columns on this card. (A total of 27 mark sensing columns is the maxi- 
mum that can be put on a card.) There are twelve mark sensing posi- 
tions in each column. The ovals are immediately above the correspond- 
ing punch positions. In use, the ovals are marked with a horizontal line 
by a special pencil. The cards are processed on the 513 Reproducing 
Punch, to which a mark sensing unit has been attached. The mark sens- 
ing brushes consist of sets of three regular brushes. In each set of three 
brushes the outer brushes are common and the center brush is con- 
nected to a mark sensing brush hub on the plugboard. A mark is sensed 
when it bridges the gap between the center brush and either outer 
brush in that group. The marks are sensed conductively rather than 
photo-electrically. It is necessary, however, to amplify the current 
through an amplifier unit using vacuum tubes as relays. 

The unit is completely flexible in that each mark sensing column may 
be punched in any column on the card. The card in Figure 1, for exam- 
ple, has been laid out for the first two mark sensing columns to be 
punched in columns 8 and 9. 

The speed of mark sensing is the standard reproducer speed of 100 
cards per minute. Ten mark sensing columns can be sensed at one time 
by one mark sensing unit. Completely to process the card in Figure 1 
requires three runs through a reproducer with one mark sensing unit. 
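
The arithmetic behind the three runs may be set out in a short sketch. It is offered only as an illustration in Python: the function names and the batch of 14,000 cards in the last line are assumptions introduced here, and the time taken to handle cards between runs is not modelled.

```python
import math

CARDS_PER_MINUTE = 100   # standard reproducer speed given in the text
COLUMNS_PER_RUN = 10     # mark sensing columns sensed at one time by one unit


def runs_needed(mark_sensing_columns: int) -> int:
    """Runs through the reproducer needed to sense every column of a card."""
    return math.ceil(mark_sensing_columns / COLUMNS_PER_RUN)


def machine_minutes(cards: int, mark_sensing_columns: int) -> float:
    """Machine time only, for one reproducer with one mark sensing unit."""
    return cards * runs_needed(mark_sensing_columns) / CARDS_PER_MINUTE


print(runs_needed(22))               # the 22-column card of Figure 1
print(machine_minutes(14_000, 22))   # hypothetical batch of 14,000 such cards
```

On these figures a card with ten or fewer mark sensing columns needs a single run, which is one reason for keeping the number of columns on a card small where the data allow it.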

In order to verify that a mark has been sensed and punched, use is 
made of the double punch and blank column detection device, which 
automatically indicates a card which has a double punched column or a 
blank column. Since the blank column detection cannot be used with- 
out the double punch detection, verification in mark sensing is lim- 
ited to columns which are designed for single punching only, unless one 
of the double punches occurs in the 11 or 12 positions, and the other 
occurs in the 0-9 positions, in which case column splits may be used. 
The double punch detection device does not detect double marks on 
the cards; it detects double punching of the cards. 
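
The logic of this check may be sketched as follows. It is an illustrative model in Python, not a description of the wiring of the device: the row numbering (12 and 11 for the two upper rows, 0 through 9 below), the function name, and the treatment of a split column in which one part is empty are all assumptions made for the example.

```python
ZONE_ROWS = {12, 11}           # the two upper punching positions
DIGIT_ROWS = set(range(10))    # positions 0 through 9


def check_column(punched_rows, column_split=False):
    """Classify one punched column as 'ok', 'blank' or 'double'.

    The check is made on the punches, not on the pencil marks.
    With column_split=True the 12/11 part and the 0-9 part are examined
    separately, so one upper punch together with one digit punch passes.
    Assumption: a split column passes if at least one part is punched.
    """
    rows = set(punched_rows)
    if not rows:
        return "blank"
    if column_split:
        upper, lower = rows & ZONE_ROWS, rows & DIGIT_ROWS
        return "double" if len(upper) > 1 or len(lower) > 1 else "ok"
    return "double" if len(rows) > 1 else "ok"


print(check_column([7]))                          # single punch
print(check_column([]))                           # nothing punched
print(check_column([11, 7]))                      # two punches, no split
print(check_column([11, 7], column_split=True))   # separated by the split
print(check_column([0, 3], column_split=True))    # both in the 0-9 part
```

The last case is the one that matters for card design: two punches that both fall in the 0-9 positions cannot be separated by a column split.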

The original plan for applying mark sensing in The Psychological 
Program was devised by Colonel John C. Flanagan, who worked out a 
detailed plan for psychomotor testing. Each candidate was to be given 
a different card for each psychomotor test. When he entered the room 
for the test he gave his card for that test to the examiner, who on 
completion of the test marked the test scores on the card and filed it. 
The cards for this were planned and printed very early in the program, 
before it was known definitely just how many psychomotor tests would 
be used in the battery. Ten cards were prepared for each candidate in 
order to be sure to have enough. These were card forms like that in 
Figure 2, which is the number 9 card. 

The first five columns of each card contain an identifying number 
called the testing number. In column six is punched the number of the 
card in the set. The four mark sensing fields with a total of 8 mark 
sensing columns were for the psychomotor test scores, which for each 
test consisted of a total raw score, a standard score, and one or two part 
scores. The four mark sensing fields are in the same position on all ten 
cards. Each card must therefore be processed by itself, that is, all the 
number 1 cards were mark sensed and the standard scores for test 1, 
marked in field A, were punched in columns 37 and 38. For test 2, field 
A was punched in columns 39 and 40, etc. The marks in the other 
fields were punched in other columns. After each set of cards was 
mark sensed all cards were sorted together on columns 1-6 and 
sequenced on the collator, thus putting all cards of each man together 
and in order by test. They were then put through the reproducer and 
by interspersed master gang punching all the scores were combined in 
the 9-card. The 9-card thus became a summary card. The other cards 
were stored in semi-dead files, as reference to them was necessary only 
for verification of scores. All analyses were performed on the summary 
card. When less than ten psychomotor tests were administered we used 
less than ten cards, but the 9-card was always the summary card. 
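
The combining of the separate test cards into the summary card can be illustrated by a small sketch in Python. Only the standard score of field A is carried, and a card is represented as an 80-character string. The column pairs for tests 1 and 2 are those given above; the extension of the same step to later tests, the tuple layout, and the function names are assumptions made for the example.

```python
from itertools import groupby

SUMMARY_CARD_NO = 9


def score_columns(test_no):
    """Columns (numbered from 1) that receive the standard score of a test.

    Test 1 -> (37, 38) and test 2 -> (39, 40) come from the text; carrying
    the same step on to later tests is an assumption.
    """
    first = 37 + 2 * (test_no - 1)
    return first, first + 1


def build_summary_cards(cards):
    """cards: iterable of (testing_number, card_no, standard_score).

    Returns {testing_number: 80-character card image} in which each test's
    two-digit standard score stands in its own pair of columns.
    """
    summaries = {}
    # Sorting on testing number and card number stands in for the sort
    # on columns 1-6 and the sequencing on the collator.
    ordered = sorted(cards, key=lambda c: (c[0], c[1]))
    for testing_number, group in groupby(ordered, key=lambda c: c[0]):
        image = [" "] * 80
        image[0:5] = f"{testing_number:05d}"   # columns 1-5: testing number
        image[5] = str(SUMMARY_CARD_NO)        # column 6: card number
        for _, card_no, score in group:
            first, last = score_columns(card_no)
            image[first - 1:last] = f"{score:02d}"
        summaries[testing_number] = "".join(image)
    return summaries


cards = [(42, 2, 6), (42, 1, 57), (7, 1, 33)]   # invented scores
for number, image in build_summary_cards(cards).items():
    print(number, repr(image[:6]), repr(image[36:40]))
```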

Figure 3 shows a revised individual test mark sensing card which 
allowed more space for recording scores. 

It was necessary to make complete plans for reporting scores before 
any actual scores were recorded in Ft. Worth, and in fact before the 
extensive testing program was started at the testing centers. After a 
good deal of discussion with IBM representatives, it was decided that 
it would be feasible to secure all the test scores and miscellaneous other 
information by mark sensing. Accordingly two cards were designed for 
paper and pencil test scores. The first of these is that shown in Figure 
1, and the other is Figure 4. There is room on these two cards for 17 
two-digit scores. These cards have an advantage over the first 
psychomotor test card in that the x and y positions are available for marking 
scores that are unknown, making it possible to insist on a mark in every 
column, so that the blank column checking device could be used ef- 
ficiently. Each card could be mark sensed in three runs through the 
reproducer. Card 1 was reproduced into card 2, making card 2 the sum- 
mary card. The fields numbered 1 to 17 were for test scores. The three 
two-digit fields labeled Bomb., Nav., Pilot, were for two-digit composite 
scores giving measures of aptitude for each type of training. These 
composite scores were based on weighted averages of paper and pencil 
tests and psychomotor (apparatus) tests. For ease of use, these two- 
digit scores were converted to single digit scores running from 1 to 9. 
These single digit scores, called stanines, were entered in the three fields 
near the right edge of card 2. The last column in card 2 had entered in it 
the amount of credit received by the candidate for previous flying 
training he may have had. The actual process involved in entering these 
scores will be described more fully later. 

The other two cards used in the testing program are the name card 
and the stub card. The name card is shown in Figure 5. The first edition 
of the stub card is in Figure 6. The second edition is in Figure 7. 

The name card and the stub card were presented to the candidate 
at the first testing session. They were connected by a small stub and 
both were pre-numbered and pre-punched with the testing number. 
Testing numbers were allotted in blocks to the testing centers, and 
were assigned in chronological order to the cadets, so that a numerical 
order by testing number was also in order by testing center and by date 
of testing. The assigning of numbers was automatic since the candidate 
wrote his name on the name card and the stub and that number was 
thus assigned to him. His testing number was pre-printed and pre- 
punched on all his other mark sensing cards, both paper and pencil test 
cards and psychomotor test cards. A pre-punched, pre-printed card 
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appears in Figure 2. The other cards shown in the figures are samples 
without the pre-printing and pre-punching. .- 
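
The effect of allotting numbers in blocks may be seen in a brief sketch. It is written in Python for illustration only; the class name, the sizes of the blocks, and the pairing of particular blocks with particular centers are hypothetical and are not taken from the program records.

```python
class NumberBlock:
    """A run of consecutive testing numbers allotted to one testing center."""

    def __init__(self, center, first, last):
        self.center = center
        self.first = first
        self.last = last
        self._next = first

    def assign(self):
        """Give out the next unused number, in order of arrival."""
        if self._next > self.last:
            raise ValueError(f"block for {self.center} is exhausted")
        number = self._next
        self._next += 1
        return number


# Hypothetical blocks; the real allotments are not given in the text.
nashville = NumberBlock("Nashville", 1, 20000)
san_antonio = NumberBlock("San Antonio", 20001, 40000)

issued = [
    (san_antonio.assign(), "San Antonio"),
    (nashville.assign(), "Nashville"),
    (nashville.assign(), "Nashville"),
]
# Sorting on the number alone brings each center's candidates together,
# and within a center keeps them in the order in which they were tested.
print(sorted(issued))
```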

The name card (Figure 5) did not require any mark sensing. The ca- 
det filled it in by writing the date, his name, army serial number and 
date of birth, and by encircling the appropriate entries for marital 
status, education, and previous flying experience. This card was de- 
signed for self coding; that is, the punch operator started punching in 
column 41, with serial number, after date of testing was duplicated in 
columns 37-40 from a master card. As the card went through the punch 
the rest of the information became visible. 

The stub was torn off by the candidate after filling in name and 
army serial number, and was retained by him as a record of his testing 
number, which he wrote on all his group test answer sheets. Prior to the 
psychomotor test session the candidate turned in his stub and received 
his set of psychomotor test cards with the same testing number. The 
mark sensing information on the right of the stub card was filled in at 
the testing center after the candidate had been tested and classified. 

All cards of the candidate were sent to Ft. Worth as they were com- 
pleted. The name cards were a source of information on the number of 
people who had started testing. The stub card information was some- 
times not available for some weeks or even months after the candidate 
was tested. The name cards for such candidates were kept in a special 
file and permitted a rapid count of incomplete processing. 

The mark sensing part of the stub card was completed at the testing 
center as the necessary information became available. The first edition 
of this card (Figure 6) was quite complex and illustrates a great many 
of the pitfalls to be avoided in mark sensing. The first mark sensing field 
gave physical qualification. The Prev class space at the top of the first 
field was marked if the candidate had previously been classified 
elsewhere. The column of letters beginning with B to the right of the first 
field gave the labels to be attached to the rows of positions all the way 
to the right hand margin: B for Bombardier, N for Navigator, P for 
Pilot, G for Ground, and the other letters for possibilities which might 
arise. Actually there did arise other classifications, such as flight en- 
gineer and gunners. The four columns in the field headed JND repre- 
sented the first to fourth preference of the individual for each type of 
training. For example, an individual whose first choice was Pilot, sec- 
ond Navigator, third Bombardier and fourth Ground would have had 
his card marked opposite the P in the first column, opposite the N in 
the second column, opposite the B in the third column, and opposite 
the G in the fourth column. The rows are the aircrew positions and the 
columns are the choices from 1 to 4. In the next field, headed P.R.U., 
were given the four recommendations of the testing center (Psycho- 
logical Research Unit); the final classification by the Classification 
Board was entered in the field headed Board. 

The top three rows contained special codes not related to those in 
rows 1 to 9. The row headed SP showed the strength of preference for 
the individual preferences. That is, it indicated how much importance 
the candidate wanted attached to his stated preferences. 

The line above that, headed Elim, represented types of training from 
which the candidate may have already been eliminated. Many men who 
had been eliminated from Pilot training, for example, were tested for 
possible entry into Bombardier or Navigator training. The last nine 
columns in this row were not used. The Y row headed Prob repre- 
sented any training which was recommended or assigned probationally. 

It will be seen that double punching on this card occurred in the four 
columns headed JND, and triple punching in this field was possible. 
Thus if the machine were wired normally, there would be a stop on 
every card since at least one of the four columns would be double 
punched. The split column device could not be used to separate the 
punches because it splits between 11 and 0, and the double punch 
found here consisted of 0 and a punch from 1-9. Consequently, the 
double punch detection had to be omitted from this field. Since the 
double punch blank column detection device on the reproducer will not 
operate for blank columns without operating for double punches it was 
not possible to use either of these checks. This turned out to be a very 
serious difficulty. 
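
The difficulty can be made concrete with a small sketch in Python. The row numbers given to B, N, P and G, the placing of the strength-of-preference mark in the 0 row of one of the four columns, and the function names are assumptions made for illustration; the card itself should be consulted for the actual layout.

```python
ROW_OF = {"B": 1, "N": 2, "P": 3, "G": 4}   # assumed rows for the lettered lines
SP_ROW = 0                                   # assumed row of the SP marks


def preference_columns(ranking, sp_column):
    """Punched rows in each of the four preference columns.

    ranking:   positions in order of choice, e.g. ["P", "N", "B", "G"]
    sp_column: index (0-3) of the column that also carries the SP mark
    """
    columns = [{ROW_OF[position]} for position in ranking]
    columns[sp_column].add(SP_ROW)
    return columns


def split_at_11_0(rows):
    """Column split as described: 12 and 11 on one side, 0-9 on the other."""
    upper = {r for r in rows if r in (12, 11)}
    lower = {r for r in rows if 0 <= r <= 9}
    return upper, lower


columns = preference_columns(["P", "N", "B", "G"], sp_column=0)
for rows in columns:
    upper, lower = split_at_11_0(rows)
    # A column is still double punched after the split if more than one
    # punch remains in the 0-9 part.
    print(sorted(rows), "double after split:", len(lower) > 1)
```

Under these assumptions the column that carries both the preference and the SP mark remains double punched on the lower side of the split, so the detection device would stop on every card.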

Mark sensing is subject to two sources of error, one in the marking 
and one in the processing in the reproducer. In the stub card we were 
getting at least our share of errors from both sources. 

The directions for marking the stub card were prepared very care- 
fully, but even after some months of use we occasionally found errors in 
marking due to failure to follow directions; this is not surprising in view 
of the complexity of the card layout. 

By far the largest number of errors in marking the card were due to 
carelessness, however, rather than to misunderstanding. We found 
almost every conceivable type of error: columns not marked at all, too 
many marks, impossible entries, fields transposed. 

We found, too, that even correctly marked cards fairly often came 
out with missing punches or too many punches. 

Because we could not use the blank column double punch detection 
device on the reproducer we had to find the errors by tedious methods, 
such as running lists on the tabulator for visual checks. This was far 
from satisfactory either for accuracy or speed. 

We also experienced difficulty in processing many stub cards be- 
cause of their bad condition due to handling. It will be remembered 
that the candidate kept the card for one day. We had prepared a crease 
for folding the card so that it could be carried easily in a pocket. The 
creasing alone did not cause any great processing trouble. But the 
combination of creasing and handling and carrying in pockets by candidates 
often did mutilate the card so that it was difficult to process. We had 
further processing difficulties because of the serrated edge when the 
card had been detached from the stub. Unless constantly watched and 
checked the punches from the mark sensing would be slightly out of 
position. This would eventually cause trouble in the reproducer and 
collator. 

After a few months of use we revised the stub card. The revised 
card is shown in Figure 7. In the new edition all double punching and 
all blank columns have been eliminated. That is, each column can be 
marked correctly in only one place, and for every candidate there will 
be information for every column. 

One reason for revising the stub card was an administrative change in 
reporting preferences. The candidate was asked to show his strength of 
interest in each aircrew position on a one to nine scale, instead of rank- 
ing his preferences from one to four. Nine positions were therefore 
necessary for each category (B, N, P, other), making necessary a 
change in the card dimensions used for the positions. That is, where in 
the first card we had B, N, P as rows, we now have them as columns. 
We could not change the dimensions of PRU and Board Classifications 
because of the extra data which we had to insert. Hence we were left 
with three fields—individual preference, PRU recommendation, and 
Board Classification—all of which should logically have aircrew posi- 
tion as either rows or columns, but which actually had different di- 
mensions. This caused us some difficulty in preparing rosters for out- 
side use. 

The cards on which the psychomotor test scores and the paper and 
pencil test scores were marked were much simpler than the stub card 
and caused less trouble in processing. Nor was there any apparent mis- 
understanding of directions for marking. Certain difficulties were en- 
countered with these cards, however. At first the accuracy of entering 
scores on mark sensing cards was a good deal less than the accuracy 
of entering scores on a roster. This was discovered by having two 
crews alternate on both methods of entering. The accuracy was im- 
proved by careful supervision, but all testing centers reported a very 
definite dislike on the part of the clerks for entering scores on mark 
sensing cards and this probably contributed to inaccuracy. 

It should be noted that the personnel who were doing the recording 
were well trained and highly competent. As a matter of fact, many of 
the clerks were enlisted men who were probably much more intelligent 
(and more interested in their work) than the usual clerks who would or- 
dinarily be used for this work. Perhaps the high level of intelligence (as 
shown in part by high Army General Classification Test scores) con- 
tributed to the dissatisfaction felt with the tedious process of marking 
the cards. 

We encountered some informal objection to mark sensing from one 
or two senior officers who stated that a permanent original record of 
scores should not consist of marked cards but should consist of rosters. 
In the case of paper and pencil tests, this objection was unfounded, 
since the original record was the IBM answer sheet. In the case of the 
psychomotor tests, however, there was some merit in the objection, 
since according to the plan for use of the mark sensing cards no other 
record of the score would be kept. This objection was never com- 
municated formally. 

This objection to the use of mark sensing cards in psychomotor test- 
ing was reinforced by the need for keeping records for research pur- 
poses of many part scores on the psychomotor tests. There was no 
room on the mark sensing card for these scores, and furthermore the 
testing unit wanted a record of the part scores for its own use. After a 
short while, therefore, we found that what could have been a real labor 
saving method had it been usable as planned, was becoming a burden 
because it was being superimposed on the old method; that is, score 
sheets for each psychomotor test were being prepared at the time of 
the test and the mark sensing cards were being filled in later. 

This duplication was soon evident in all the mark sensing operations. 
For local administrative purposes rosters of the information on the stub 
card were needed sooner than they could be furnished from Ft. Worth. 
The result was that the examining centers were making rosters of the 
same information they were putting on the mark sensing cards. Since 
the centers were having difficulty in securing sufficient clerical help, the 
extra labor of making mark sensing cards was a distinct burden. 

Certain difficulties were encountered with mark sensing in Ft. 
Worth. One difficulty was in storing the cards which were not used as 
summary cards. These consisted of one paper and pencil test card and 
nine psychomotor test cards, a total of ten cards per man. These cards 
were the original records which had to be referred to in case the scores 
on a summary card were questioned. They therefore not only had to be 
preserved, but had to be readily available because many questionable 
scores were found on summary cards as they were processed. There was 
thus a large job of filing these cards. Furthermore, reference to these 
cards was tedious because of the bulk. Finding the score on a certain 
test for a certain individual was a rather long operation after thousands 
of men had been tested. 

The problem of original records had further ramifications. When a 
card was mispunched as it was mark sensed, either because of inac- 
curate marking or machine malfunction, a correct card was made. But 
the mispunched card had to be saved since it was an original record. 
There developed therefore an extensive file of “spoiled originals.” 
When a search of the main file showed a remade card, it was necessary 
to search the file of spoiled originals for the original card. 

Eventually, after a very thorough tryout, mark sensing was aban- 
doned, principally because of the difficulties at the testing units which 
have been outlined. 

The testing program which we were operating was dynamic. The 
test battery was changed two or three times a year, and other changes 
such as method of reporting preferences had to be made. The kinds of 
supplementary information which were needed changed from time to 
time. We found that new mark sensing cards incorporating the changes 
took a good deal of time to print. It was difficult to anticipate changes 
in the testing battery in time to have new cards prepared. 

A summary of the factors which we found to be important in the 
use of mark sensing is given below: 

First, and probably most important from the operational side, is 
complexity of layout. If a complex layout is unavoidable it should be a 
warning signal against mark sensing. Complexity of layout may con- 
sist either in complex codes which lead to difficulty in the marking, or 
in double punches and blank columns which lead to difficulty in the 
processing. 

Secondly, mark sensing is contra-indicated when a permanent 
original record is necessary. There are two reasons for this: first, the 
cards are more bulky to file than typed rosters, and, second, the cards 
are hard to establish as original records in cases where scores must be 
verified for agencies outside the testing program. 

A third point to consider is whether a record in other form than a 
mark sensing card is needed before the cards can be processed and a 
roster prepared by machine. If such a record is needed it may be more 
economical to use this record for manual punching. 

A fourth point is the time necessary for revising cards in the case of a 
dynamic program. If changes are going to be at all frequent, a mark 
sensing system may be a liability, since a change can be effected no 
faster than new cards can be designed and printed. 

I want to emphasize and make explicit a point that has been implicit 
throughout the preceding discussion, namely, that our final decision 
to give up mark sensing is not a reflection on the value of mark sensing 
when it is used under conditions for which it is suitable and for which it 
has been designed. I have felt only that by presenting our experiences 
it would be possible to delineate more exactly some conditions for which 
mark sensing is not suitable. 


APPENDIX A 


Tests used in the Aviation Psychology Program of the AAF for the 
selection and classification of Aviation Cadets. 

For a complete description of the tests and of the testing program 
see the Research Reports of the Army Air Forces Psychological Pro- 
gram, U. S. Government Printing Office, 1947. 


Paper and Pencil Tests 


1. Technical Vocabulary 
2. Speed of Identification 
3. Mechanical Principles 
4. Mechanical Information 
5. Mathematics 
6. Arithmetic Reasoning 
7. Numerical Operations 
8. Spatial Orientation 
   a. Matching aerial photographs with other aerial photographs 
   b. Matching aerial photographs with maps 
9. Reading Comprehension 
10. Dial and Table Reading 


Psychomotor Tests 


1. Complex Coordination 
2. Discrimination Reaction Time 
3. Two Hand Coordination 
4. Rotary Pursuit 
5. Aiming Stress 
6. Finger Dexterity 
7. Rudder Control 








A PROPOSED BASIC COURSE IN STATISTICS 


GEORGE W. SNEDECOR 
Professor of Statistics, Iowa State College 


INTRODUCTION 


A BASIC introductory course in statistics was advocated by the National 
Research Council Committee on Applied Mathematical 
Statistics in its stimulating report of May, 1947, Reprint and Circular 
Series, Number 128. The functions of such a course are stated as fol- 
lows: “First, it should form part of a general education and as such it 
should be self-contained. Secondly, it should provide essential training 
for students majoring in the natural and social sciences, which may be 
developed further in later courses. Finally, it should interest promising 
young students in statistics as a profession.” 

Concerning laboratory work in statistics, the Report contains these 
penetrating remarks: “Often the main objective of such laboratories is 
the calculation of means, variances, correlation coefficients and other 
statistical quantities from numerical data of various types. The labora- 
tory work should place more emphasis on interpreting or drawing in- 
ferences from data and on the nature of those inferences. Simple ex- 
periments should be devised for illustrating probability laws of various 
kinds, for carrying out sampling operations and other random processes. 
The traditional flipping of coins and rolling of dice are not adequate for 
illustrating many important random processes. The mathematical 
theory of many of these processes is too complicated to be handled at 
an elementary level; their experimental demonstration will give the 
beginning student some feeling for their significance.” 

The writer finds himself in entire agreement with this report of the 
Committee on Applied Mathematical Statistics. He finds, also, that 
many other teachers realize the need for revision in the traditional 
methods of presenting the subject. These statisticians are not only 
striving to reorient their own ideas but are experimenting with new 
methods of teaching. It seems timely to propose a concrete program 
which may serve as a springboard for discussions. 

Two merging trends emphasize the necessity for changes in the teach- 
ing of statistics. One trend is the astonishing growth in the impact of 
statistics on society. For many years after the American Statistical 
Association was founded in 1839, popular knowledge of the subject was 
confined almost entirely to governmental statistics and the data of 
economics, though insurance was beginning to enter its present dominating 
position in our social structure. Today statistics has captured 
the popular fancy. The extensive data on the sports pages, the various 
opinion polls, popularity tests of many kinds including those of radio 
programs, the cost-of-living indexes with their repercussions on wage 
policies, surveys of consumer preference, crop estimates, quality con- 
trol—these are some outstanding examples of the preoccupation of 
people with statistical evidence. 

The second trend merging with the first is the no less astonishing 
growth of statistical theory. Estimation and the testing of hypotheses 
have clothed with living flesh the dry bones of numerical data. The 
emphasis on sample-to-population inferences has put new meaning into 
statistical terminology. The Fisherian concept of “information” has 
shifted interest from the formal rules of calculation and summarization 
to the vital processes of getting information into the data by use of ap- 
propriate sampling and experimental designs. 

The merging of these two trends into a single stream makes new de- 
mands on our statistical personnel. Neither the isolated theorist nor 
the submerged practitioner is able to keep abreast of the current. The 
specialist in mathematical statistics must acquaint himself with practi- 
cal problems, and equally, the work-a-day statistician must familiarize 
himself with the logic of statistical theory: fortunately, this is being 
made more readily available to those who do not have the benefit of 
mathematical symbolism. It is only by the intermixture of the two 
streams that statistics can freely flow onward. 

The fact should not be overlooked that the recent upsurge of interest 
in statistics is based on an ancient pattern of thinking, a form that has 
developed along with the thought process itself and that is more primi- 
tive and more extensive than the logical forms. Concepts of type and 
of departure from type are counterparts of statistical averages of 
location (means, regression coefficients, etc.) and of measures of scale 
such as the interquartile interval and standard deviation. The process 
of sampling is deeply embedded in human actions. Judgments about 
probability together with consequent behavior determine much of the 
pattern of our daily lives. Anyone entering the profession of statistics, 
especially the teaching of it, should enjoy the confidence that he is 
engaging in a fundamental activity of mankind. 

Since statistics is so intimate a part of our social organization, the 
conclusion is inevitable that it should be taught generally to our young 
people. As indicated in the National Research Council Committee 
report, the elements will doubtless be introduced into the high school 
curriculum as soon as there is a sufficient supply of trained teachers. 
Meanwhile, the basic course, taught at the freshman or sophomore 
level, should serve as the foundation of college curricula to produce such 
teachers along with other professional statisticians. 

Essentially, my idea is to bring the student into awareness of and 
harmony with the statistical content of our society. This content is 
extensive. It includes gossip, news and probability theory; sports and 
old age pensions; gambling and weather prediction; birth rates and 
living costs; the stock market and epidemics. Such large-scale ac- 
tivities as insurance and the census must be integrated with the more 
academic concepts of probability, distributions, sampling, estimates 
and tests of hypotheses. 

In working out the following syllabus, the author has set himself 
the ideal of presenting sound statistics in an interesting fashion. It 
would seem inadvisable to introduce so vital a subject in an austere 
and forbidding style. On the other hand, one must be constantly on 
guard lest he produce false impressions that will later have to be 
tediously eradicated. Only the elements of statistical thinking should 
be incorporated in this basic course, but the elements should constitute 
strategically chosen timbers that will fit readily into the projected 
structure. 

It is a deep-seated conviction of the author that the tools of statistics 
should be brought out and sharpened only after the need for them has 
been felt. Tables, graphs, and calculations of the various averages have 
all too often been presented as the subject matter of statistics rather 
than as its implements. The very young people whom we wish to enlist, 
those with imagination and intellectual enthusiasms, may be alienated 
by dull routine if introduced before any necessity for it becomes evident. 

It is clear that no more than an outline of a basic course can be suggested 
now. The outline will be changed and filled in by the efforts of 
many teachers who will gain experience from trying the experiment. 
Some have already started; the important thing now is to get more 
people working at the job. 


PROPOSED OUTLINE OF COURSE 


I. Introduction. The interest people show in counting and measuring. 
Some historical items showing the antiquity of the habit. Statistical 
form of much human thinking. Uncritical attitude of people toward 
numerical statements. Illustrate fallacies. Interesting and uninteresting 
statistics. Illustrative material: batting averages, birth rates, election 
returns, opinion polls, vital statistics, the average man. 

II. Inquiry by Sampling. Propose a contract to find out for a radio 
station the number of listeners to Program X. Develop limitations in 
extent of inquiry and in number of units interviewed. Discuss design 
of sample, including size. Quota sampling, random sampling, area 
methods. Use of mail and telephone. 

Assume sampling completed and ballots counted. Contrast known 
fraction of listeners in sample with unknown fraction in the population. 
Take confidence interval from table and explain meaning. Expand to 
population total number of listeners, using census information about 
area sampled. 
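
The last two steps can be sketched in a few lines. The figures below (400 interviews, 88 listeners, 50,000 persons in the area) are invented for illustration, and the normal approximation with multiplier 1.96 is an assumption standing in for whatever published table the instructor uses.

```python
import math

def proportion_interval(listeners, interviewed, z=1.96):
    """Approximate 95% interval for a population fraction (normal approximation)."""
    p_hat = listeners / interviewed
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / interviewed)
    return p_hat, p_hat - half_width, p_hat + half_width

# Hypothetical survey figures, not taken from the article.
interviewed = 400
listeners = 88
area_population = 50_000   # assumed to come from census information

p_hat, low, high = proportion_interval(listeners, interviewed)

# Expand the sample fraction and its limits to population totals.
estimate_total = p_hat * area_population
low_total = low * area_population
high_total = high * area_population

print(f"sample fraction: {p_hat:.3f}  interval: {low:.3f} to {high:.3f}")
print(f"listeners in area: {estimate_total:.0f} ({low_total:.0f} to {high_total:.0f})")
```

The known fraction belongs to the sample; the interval is the statement made about the unknown fraction in the population.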

This sampling problem has been introduced ahead of the chapter on 
a priori probability for three reasons: (i) It is the newer and more prac- 
tical concept; (ii) It leads immediately into a modern social problem 
instead of into the more ancient and perhaps less honorable one of 
games of chance; (iii) The term population has its obvious and funda- 
mental meaning, sample-to-population inferences being inevitable. 
Experience has convinced me that the student feels himself on familiar 
ground. 

Laboratory: Conduct opinion poll sampling of student body; con- 
struct questionnaire, design sample, discuss interview technique, tabu- 
late results, make inferences about population. Students enjoy this ex- 
perience, ask many penetrating questions and open opportunities for 
sound instruction. 

III. Random Sampling from Population with Known Constitution. 
Use bowl containing equal numbers of beads of two colors, or use table 
of random numbers with equal numbers of odd and even digits. Draw 
many samples of 10, recording numbers of “successes.” Contrast sample 
ratios with that of population. Set confidence interval from each 
sample and determine proportion of correct statements made. Compare 
results with theory. Emphasize sampling variation and observe im- 
proved reliability when samples are combined into larger samples. 
Compare with poll of radio listeners where parameter is unknown. 

Laboratory: Extend sampling experience by use of coins or dice. 
Record result of each toss for later use. 
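
Where a bowl of beads is not at hand, the experiment of this chapter may be imitated on a computing machine. The sketch below is one possible arrangement, not a prescribed procedure: it assumes a pseudo-random generator in place of the beads, and it uses the Wilson score interval (an assumption; the outline does not say which interval is intended) because it behaves tolerably in samples of 10.

```python
import math
import random

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (approximate 95% for z = 1.96)."""
    p_hat = successes / n
    centre = (successes + z * z / 2) / (n + z * z)
    half = z / (n + z * z) * math.sqrt(n * p_hat * (1 - p_hat) + z * z / 4)
    return centre - half, centre + half

def coverage_by_experiment(p=0.5, n=10, samples=2000, seed=1948):
    """Draw many samples and count how often the interval contains p."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(samples):
        successes = sum(rng.random() < p for _ in range(n))
        low, high = wilson_interval(successes, n)
        correct += low <= p <= high
    return correct / samples

def coverage_by_theory(p=0.5, n=10):
    """Exact probability that the interval contains p, from the binomial."""
    total = 0.0
    for x in range(n + 1):
        low, high = wilson_interval(x, n)
        if low <= p <= high:
            total += math.comb(n, x) * p**x * (1 - p) ** (n - x)
    return total

observed = coverage_by_experiment()
expected = coverage_by_theory()
print(f"proportion of correct statements: {observed:.3f} (theory {expected:.3f})")
```

With samples as small as 10 the proportion of correct statements need not equal the nominal 95 per cent exactly, which is itself a useful point for discussion.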

IV. Frequency Distribution. Tabulate numbers of samples with 0 to 
10 successes. Present distribution graphically. The mode as an estimate. 
The small fraction of extreme values as an explanation of confidence in 
sampling. The mean number of successes in the aggregate of each stu- 
dent’s sampling. Emphasize the greater reliability and utility of the 
mean as compared with the mode. Combine samples of 10 into larger 
groups and observe (i) the greater variation in the number of successes 
and (ii) the lesser variation in the proportion of successes. Law of Large 
Numbers. 
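
The two observations can be checked against theory before the class combines its samples. The sketch below assumes independent draws with a fixed probability of success of one half and uses the standard binomial formulas; the sample sizes are arbitrary.

```python
import math

def sd_of_count(n, p=0.5):
    """Standard deviation of the number of successes in n draws."""
    return math.sqrt(n * p * (1 - p))

def sd_of_proportion(n, p=0.5):
    """Standard deviation of the proportion of successes in n draws."""
    return math.sqrt(p * (1 - p) / n)

for n in (10, 40, 160, 640):
    print(f"n={n:4d}  sd of count={sd_of_count(n):6.2f}  "
          f"sd of proportion={sd_of_proportion(n):.4f}")
```

Quadrupling the sample doubles the spread of the count and halves the spread of the proportion.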

V. Insurance. Next to government, the greatest cooperative social 
enterprise in America. Historical items. The premium, the expectation 
of loss, is the price of protection equally enjoyed by all participants. 
Life insurance and mortality tables. Expectation of life. Emphasize 
definite statements about uncertain events. Pure premium for single 
year term insurance. Level premiums for term and life policies. Insurance 
vs. savings accounts. 

Laboratory: Calculating machines will be required to compute vari- 
ous premiums. 

Insurance is now a social problem of great importance. The soundness 
of the trend away from pure insurance which is predominantly statistical, 
towards various forms of savings devices which are mainly financial, 
is questionable. The student should be able clearly to distinguish be- 
tween insurance and investment. 
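
A minimal sketch of the premium calculations follows. The mortality rates are invented, and the conventions are simplifying assumptions, not a statement of actuarial practice: the benefit is paid at the end of the year of death, and premiums are paid at the start of each year by those still living.

```python
def single_year_pure_premium(face, q, interest=0.0):
    """Expected loss for one year of term insurance, discounted one year."""
    return face * q / (1 + interest)

def level_term_premium(face, yearly_q, interest=0.0):
    """Level annual premium whose expected present value equals that of the claims."""
    v = 1 / (1 + interest)
    alive = 1.0            # probability of surviving to the start of the year
    pv_claims = 0.0
    pv_premium_unit = 0.0  # present value of 1 paid yearly while alive
    for year, q in enumerate(yearly_q):
        pv_premium_unit += alive * v**year
        pv_claims += face * alive * q * v ** (year + 1)
        alive *= 1 - q
    return pv_claims / pv_premium_unit

# Hypothetical mortality rates for three successive ages.
rates = [0.002, 0.003, 0.004]
print(single_year_pure_premium(1000, rates[0]))
print(round(level_term_premium(1000, rates), 4))
```

The level premium lies between the lowest and highest single-year premiums, so the early payments exceed the current expectation of loss.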

VI. A Priori Probability I: Games of Chance. A primitive, universal 
human interest. Probability and expectation. Conditions of fair play. 
Effects of limits on resources and time—sample vs. population. Playing 
against The House. Systems of play. Probabilities of runs. 

Laboratory: Try a system like the Martingale, noting winners and 
losers at end of specified number of throws. Balance accounts in the 
aggregate. Work out probabilities in some game. 

This is one of the chapters that affords great opportunity for ex- 
posing popular fallacies. 
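
The Martingale laboratory can also be run by machine. The sketch below rests on assumptions of my own choosing: a fair even-money game, 1,000 players, 20 throws each, and a bankroll of 63 units, which covers six successive doublings of a one-unit stake. A player stops when he cannot cover the next stake.

```python
import random

def play_martingale(rng, throws=20, bankroll=63, base_stake=1):
    """Double the stake after each loss; return to the base stake after a win."""
    stake = base_stake
    for _ in range(throws):
        if stake > bankroll:      # limited resources: cannot cover the next bet
            break
        if rng.random() < 0.5:    # fair game: win the stake
            bankroll += stake
            stake = base_stake
        else:                     # lose the stake and double it
            bankroll -= stake
            stake *= 2
    return bankroll

rng = random.Random(1948)
start = 63
finals = [play_martingale(rng, bankroll=start) for _ in range(1000)]

winners = sum(f > start for f in finals)
losers = sum(f < start for f in finals)
aggregate = sum(f - start for f in finals)
print(f"winners: {winners}  losers: {losers}  aggregate gain: {aggregate}")
```

Most players finish a little ahead and a few finish far behind. Balancing the accounts in the aggregate shows whether the system has altered the expectation of the game.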

VII. A Priori Probability II: The Binomial Distribution. Develop 
first with probability of ½. Compare sample distributions with theoretical. 
Raise question of testing hypotheses. Develop distribution with 
probability different from ½. Skewed distributions. Mean and mode. 

Laboratory: Throw dice to get data for samples for asymmetrical 
binomial. Compare samples with population. Save data for testing in 
next chapter. 
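
The theoretical distributions against which the samples are compared can be tabulated directly. The sketch assumes samples of 10, with probability 1/2 for the symmetrical case and 1/6 (one face of a die) for the asymmetrical case; both choices are illustrative.

```python
import math

def binomial_pmf(n, p):
    """Probabilities of 0, 1, ..., n successes in n independent trials."""
    return [math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

def mean_and_mode(pmf):
    mean = sum(x * prob for x, prob in enumerate(pmf))
    mode = max(range(len(pmf)), key=lambda x: pmf[x])
    return mean, mode

symmetric = binomial_pmf(10, 1 / 2)
skewed = binomial_pmf(10, 1 / 6)

for name, pmf in (("p = 1/2", symmetric), ("p = 1/6", skewed)):
    mean, mode = mean_and_mode(pmf)
    print(f"{name}: mean = {mean:.3f}, mode = {mode}")
```

In the skewed case the mean and the mode no longer coincide, which is the point of contrast the chapter raises.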

VIII. Empirical Probability. Prevalence of vague judgments and ac- 
tions based on them. Sampling basis of most calculated probability. 
Sampling from populations with unknown probability. Illustrate by 
throwing a loaded die. Calculate chi-square for hypothesis of perfect 
balance. Get distribution of chi-square from samples of 10. Test of 
hypothesis. Meaning and use of table of chi-square. Contrast samplings 
from populations with known and unknown probability. Emphasize 
the more practical problem of sampling from unknown probability with 
resulting inferences and tests. Ordinarily the parameter is forever unknown, 
but pertinent hypotheses intrude themselves. 

Laboratory: Extend the experimental basis for the chi-square dis- 
tribution. Test an ample number of hypotheses about various samplings 
that have been made. Extend to more than one degree of freedom. Test 
the goodness of fit of samples from binomial distributions. 
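
The calculation of chi-square may be sketched as follows. The counts are invented; the 5 per cent points 3.841 (one degree of freedom) and 11.070 (five degrees of freedom) are the usual tabulated values.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over the classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Tabulated 5 per cent points of chi-square, keyed by degrees of freedom.
CRITICAL_5_PERCENT = {1: 3.841, 5: 11.070}

# One degree of freedom: 10 throws scored as odd or even (hypothetical counts).
odd_even = chi_square([8, 2], [5, 5])

# Five degrees of freedom: 60 throws tallied by face (hypothetical counts).
faces = chi_square([5, 6, 7, 8, 9, 25], [10] * 6)

print(f"odd/even: {odd_even:.2f}  reject balance: {odd_even > CRITICAL_5_PERCENT[1]}")
print(f"six faces: {faces:.2f}  reject balance: {faces > CRITICAL_5_PERCENT[5]}")
```

The first sample does not contradict the hypothesis of perfect balance at the 5 per cent level; the second does.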

This chapter is the climax of the first part of the course. A substantial 
body of statistical theory has been accumulated together with some 
practical problems and a number of socially advantageous applications. 
Sampling from specified populations has been emphasized. The bi- 
nomial distribution has been developed. A sampling distribution of chi- 
square has been built up, leading to confidence in the table. Estimates 
and tests of hypotheses have been justified and applied. Uncertain in- 
ference has been exemplified. If the student ends his contact with the 
course here he will have had experience with the fundamental concepts 
of statistics. 

IX. Measurement. Start with guesses at the length of an 18-20 inch 
bar. Distribution of guessed lengths. Measure with scale and sum- 
marize in distribution. Emphasize variation as characteristic of all meas- 
urement. Parameter is again always unknown. Develop idea of normal 
distribution as a model like the binomial. Properties of the normal dis- 
tribution. Mean and standard deviation of sample as appropriate 
estimates of parameters. Methods of calculation. Repeated measure- 
ments of the same thing compared to measurements of the members of 
a sample from a normal population. Emphasize conceptual character 
of models and impossibility of learning parameters by sampling. 

Laboratory: Let all measure height of one member of class. Distribu- 
tion of heights of men and women. Perform some psychological experi- 
ment leading to near-normal distribution. 

X. Sampling Distributions. Each member of class draws 10 or more 
samples of 10 from a normal population, using table of random digits 
for randomization. Mean and variance are calculated for each sample. 
Tabulate distribution of each, showing normality of first and skewness 
of second. Estimate population mean and variance from each. Calculate 
t for each sample, using known population mean. Distribution of 
t. Calculate confidence interval based on each sample and verify frac- 
tion of correct statements. 

Laboratory: Construct normal curve from parameters of sampled 
population. Fit normal distribution to distribution of means. Test 
normality of distribution of variance. 
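
The class exercise can be imitated as below. This is a sketch under stated assumptions: a pseudo-random normal generator replaces the table of random digits, the population mean of 50 and standard deviation of 10 are arbitrary, and 2.262 is the tabulated 5 per cent point of t for 9 degrees of freedom.

```python
import math
import random
import statistics

T_5_PERCENT_9DF = 2.262  # tabulated two-sided 5% point of t, 9 degrees of freedom

def draw_sample_summaries(mu=50.0, sigma=10.0, n=10, samples=2000, seed=1948):
    """Return (mean, variance, t, interval covers mu) for each sample."""
    rng = random.Random(seed)
    rows = []
    for _ in range(samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        variance = statistics.variance(sample)   # divisor n - 1
        std_error = math.sqrt(variance / n)
        t = (mean - mu) / std_error
        covers = abs(t) <= T_5_PERCENT_9DF       # same as mu lying in mean +/- t*se
        rows.append((mean, variance, t, covers))
    return rows

rows = draw_sample_summaries()
coverage = sum(r[3] for r in rows) / len(rows)
mean_of_means = statistics.fmean(r[0] for r in rows)
mean_of_variances = statistics.fmean(r[1] for r in rows)
print(f"fraction of correct statements: {coverage:.3f}")
print(f"average mean: {mean_of_means:.2f}  average variance: {mean_of_variances:.1f}")
```

Tabulating the first and second columns of `rows` shows the near-normal distribution of the means and the skewed distribution of the variances.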

XI. Sampling From Some Human Population. If time and facilities 
are available, this may be an actual sampling. Usually this chapter 
will be limited to tabulation, summarization and presentation of avail- 
able data from some sample survey. Calculate mean and variance. Test 
normality of distribution. Confidence intervals. Expand estimates to 
population totals. 

Laboratory: Most of this chapter is the laboratory type of work. 

XII. Non-Normal Distributions. Rectangular and skewed. The me- 
dian as an estimate. Sample distributions of mean and median. Em- 
phasize that no known actual distribution is normal. Use of median and 
related order statistics. 

Laboratory: Draw complete set of samples from some small rectan- 
gular populations showing central tendency of means. 

XIII. Regression. Growth curves and economic trends. Calculation 
of linear regression on several assumptions. Deviations of individuals 
from trend; case studies. Estimates, confidence statements and tests of 
hypotheses. 

Laboratory: Get results of aptitude tests and college grades. Es- 
timate latter from former. Each student calculates his own deviation. 
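
A small sketch of the laboratory calculation follows, using the ordinary least-squares line. The aptitude scores and grade averages are invented for five hypothetical students.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for the regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical aptitude scores and grade-point averages.
aptitude = [40, 50, 60, 70, 80]
grades = [1.8, 2.4, 2.5, 3.1, 3.2]

intercept, slope = fit_line(aptitude, grades)

# Each student's deviation: actual grade minus the grade estimated from aptitude.
deviations = [g - (intercept + slope * a) for a, g in zip(aptitude, grades)]

print(f"estimated grade = {intercept:.2f} + {slope:.3f} x aptitude")
print([round(d, 2) for d in deviations])
```

The deviations sum to zero, so a student's positive deviation always has its counterpart elsewhere in the class.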

XIV. Index Numbers. Construct a simple cost-of-living index with 
direct economic meaning—the changing cost of a specific bill of goods. 
Emphasize necessity of examining computation before attaching mean- 
ing to an index. 

Laboratory: Construct cost-of-student-living index in your college. 
Continue from year to year. 
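
Such an index is simply the changing cost of a fixed bill of goods, expressed as a percentage of its cost in the base year. In the sketch below the items, quantities, and prices are all invented.

```python
def basket_cost(prices, quantities):
    """Cost of the fixed bill of goods at the given prices."""
    return sum(prices[item] * quantities[item] for item in quantities)

def cost_index(prices_now, prices_base, quantities):
    """Cost of the bill of goods now, as a percentage of its base-year cost."""
    return 100 * basket_cost(prices_now, quantities) / basket_cost(prices_base, quantities)

# Hypothetical bill of goods for one student-month.
quantities = {"board": 30, "room": 1, "books": 2, "laundry": 4}
prices_1947 = {"board": 1.50, "room": 20.00, "books": 3.00, "laundry": 0.75}
prices_1948 = {"board": 1.65, "room": 22.00, "books": 3.00, "laundry": 0.80}

index_1948 = cost_index(prices_1948, prices_1947, quantities)
print(f"cost-of-student-living index, 1947 = 100: {index_1948:.1f}")
```

Because the quantities are held fixed, the index has a direct meaning only for a student who buys this particular bill of goods.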

XV. Correlation in normal bivariate population. Estimates. Confidence 
interval and test of hypothesis ρ=0. Mental tests. Inherited 
characteristics. The variance of differences. Correlation and regression. 
Rank correlation in samples from non-normal populations. 

Laboratory: Construct sampling distribution of r in small samples. 

XVI. Sampling from More than One Population; e.g., men’s heights 
and women’s. Combining estimates—under what circumstances is it 
appropriate? Stratified sampling. Estimates of mean and variance— 
weights. Analysis of variance in groups. 

Laboratory: Draw samples and verify various estimates. Distribu- 
tion of F. 

The foregoing chapters constitute a course occupying two quarters. 
With omission of chapters V, XI, XIV, and the latter parts of some 
other chapters, a semester’s course is available. For a third quarter, 
one or more of the remaining topics may be expanded to suit special 
groups; quality control for engineers and industrial economists, assays 
for entomologists, etc. 

XVII. Statistical Instruments of the Federal and State Governments. 
The census with population studies. Public health and vital statistics. 
Marketing and other economic statistics. Crop estimates. Relief and 
security statistics. 

XVIII. Financial and Business Statistics. Markets and market rec- 
ords. Trends and forecasts. Consumer acceptance and preferences. 

XIX. Quality Control. Historical items. Control chart with statisti- 
cal features. Contracts involving control contrasted with those pro- 
viding inspection. Sequential test. 

XX. Assays. Governmental regulations. Pure food and drugs. In- 
secticides. Vitamins. Purity of seed. 

XXI. Experimentation. Science and experiments. Alternation of in- 
duction and experiment. Control of extraneous variation. Groups and 
“randomized blocks.” Analysis of variance. Statistical control: covariance. 
Broadening the basis of inference: factorial design. Interaction. 

XXII. The Statistical Attitude. Group vs. individual. The usual vs. 
the exceptional. News vs. everyday life. Detection of fallacies. Prob- 
ability vs. certainty. Evidence vs. proof. 


The mathematical accompaniment of such a course can be adapted 
to the training of the students. It may vary from simple algebraic 
derivations of formulas with practice in the notation to a full course 
in mathematical statistics. Since the logical concepts of statistical the- 
ory can be presented and verified experimentally, without the symbol- 
ism of mathematics, it is the author’s opinion that the mathematical 
formulation should follow some more general presentation such as that 
outlined in this syllabus. 

On writers of texts for this course will rest the heavy responsibility 
of making their appeal directly to the student. At present there are not 
enough teachers to go around. An instructor’s manual containing back- 
ground material (such, for example, as the data from a consumer’s 
preference survey) would seem to be a necessity, since this kind of 
source material is not available to many. In fact, such a manual may 
well be expanded into a book on the teaching of elementary statistics. 











ACTUARIAL ESTIMATES FOR PUBLIC SICKNESS 
INSURANCE PLANS 


ABRAHAM M. NIESSEN* 
U. S. Railroad Retirement Board 


Proponents of social insurance consider sickness insurance a 
logical extension of the unemployment compensation pro- 
gram. Two state plans and one Federal sickness insurance plan 
are now in operation, and a number of other states are giving 
serious consideration to this matter. 

This paper explores the possibility of making cost estimates 
for public sickness insurance plans along established actuarial 
lines. The use of the actuarial approach may be of importance 
when long-range estimates are desired. 

Appropriate actuarial techniques are presented for a typical 
public sickness insurance plan. The method is theoretically 
simple, and does not require the use of statistics which are not 
ordinarily available. The details would depend on the pro- 
visions of the plan and on the purpose of the cost calculations. 
The general method is sufficiently flexible to be adoptable in 
various situations. 


IN RECENT YEARS a number of states have considered the introduction 
of sickness insurance plans for workers covered under the unemployment 
compensation schemes. Two states, namely Rhode Island 
and California, actually enacted laws to this effect. A similar develop- 
ment took place with respect to railroad workers for whom a nation- 
wide sickness insurance program was established by the amendments 
enacted in 1946. The methods used in connection with this type of in- 
surance naturally fall in two categories depending upon the purposes 
at hand. For short term cost estimates, methods commonly employed 
by statisticians and economists may be advisable. However, for long 
range estimates actuarial methods would seem preferable. 

To the knowledge of the author no actuarial technique has as yet 
been developed for long range cost estimates in connection with large 
public sickness insurance schemes. This paper constitutes an attempt 
to develop new techniques along lines similar to those used in pension 
fund work. It is believed that the method here presented is in principle 
simple and that it can be readily applied without recourse to any spe- 
cial statistics that are not usually available. It is of course recognized 
that at the present time none of the public insurance plans has been in 
existence long enough to furnish experience upon which reliable actuarial 
estimates could be made. Nevertheless, an early discussion of 
the problem may lead to an exchange of ideas which would result in the 
establishment of a uniform method for the valuation of plans of this 
type. It is in the spirit of starting off such a discussion that this paper 
has been written. 

* The opinions expressed in this paper are those of the author and do not necessarily represent the 
official views of the Railroad Retirement Board. 


GENERAL CONSIDERATIONS 


Sickness rates are known to depend on age, race, sex, occupation, 
geographical location, employment opportunities, and to some extent 
also on the duration of coverage. Other factors to be considered are 
variations by calendar year and secular trends. There is considerable 
difference between making estimates for a “going” system as compared 
with a new one, since the sickness experience of different programs will 
generally be affected by the relation of benefits to wages, administra- 
tive procedures, and by the existence of other related social insurance 
programs. From a short range viewpoint most or all of these factors 
are important. However, when a long range cost estimate is attempted 
it is practical to consider only the most important factors. It is be- 
lieved that a sufficiently reliable estimate may be obtained by consid- 
ering only variations by age, sex, and race. This approach would seem 
particularly justifiable in the case of a compulsory state-wide plan pro- 
vided that experience has been available for a period long enough to 
be considered representative. Even preliminary estimates can be de- 
veloped by using actuarial methods. In such a case, the basic rates and 
averages would have to be estimated on the basis of available sickness 
rates and previous estimates made in connection with the operations 
of the unemployment compensation program. 

The main advantages of the method presented in this paper lie in 
the fact that the calculations can be performed with a minimum of 
statistics, and that provision will be made for the effect of reserves. It 
is recognized, of course, that a sickness insurance plan in which the 
right to benefits terminates not later than two years after the year of 
last employment does not have to cope with the problem of deferred 
benefits. To this extent, the question of reserves is of much smaller 
significance than in a retirement plan. However, even in public sick- 
ness insurance plans reserves are necessary in order to prevent frequent 
changes in the level of contributions, and to provide a safety margin for 
contingencies. 

The mathematical development presented in this paper generally 
follows along the lines of actuarial techniques established for the valuation 
of large self-insured pension funds. Because of the different nature 
of the problems involved many significant modifications had to be 
made. In pension fund work it is customary to consider not only the 
age of the employees but also the duration of their service. This is 
taken care of by the introduction of actuarial factors (such as with- 
drawal rates, wage scales etc.) which depend on both age at entry and 
duration. It is usually convenient to consider the effect of the duration 
for only a few years which are then said to constitute the select period. 
All longer durations are then combined and the corresponding calcula- 
tions proceed by attained age only. The length of the select period is a 
matter which has to be determined on the basis of preliminary studies. 
It is not certain that the duration of covered employment will have a 
considerable effect on the experience of a sickness insurance program. 
It was felt, however, that a theoretical discussion of actuarial tech- 
niques for such programs should indicate the procedures which would 
have to be followed in case it is decided to make allowances for the 
effect of the duration of substantial employment. For this reason, it 
was deemed advisable to introduce a select period of one year. It would 
obviously be an easy matter to modify the mathematics so as to eliminate 
the select period altogether, or to introduce a select period of 
longer duration. 

In order to avoid trivial detail the assumption was made that the 
reader is familiar with tabulating procedures and that he is acquainted 
with the rudiments of general actuarial theory. For the benefit of the 
reader who might wish to refresh his memory on some particular topic 
a few references have been appended at the end of the paper. Reference 
(e) has been included mainly because the author of this paper is in- 
debted to Mr. Rusam for a few ideas regarding the construction of 
actuarial functions for sickness insurance calculations. 

Since it was desired to present a general outline of the proposed 
method, the convenient assumption was made that the only break- 
downs will be those by age and duration. If other breakdowns, as by 
race, sex, and occupation, are deemed necessary, the method would 
have to undergo some slight modifications especially in the part dealing 
with the level cost calculations. 


DESCRIPTION OF THE PLAN 


The sickness insurance plan under discussion is assumed to have the 
following basic characteristics. 
1. Participation is compulsory on the part of all workers covered 
under the state unemployment compensation program. No 
other individuals are admitted. 

2. The plan is of the monopolistic type, that is, no substitution 
of private group insurance is permitted. 

3. Contributions are in the form of a level percentage of taxable 
wages without differentiation by age, sex, race, or occupation. 

4. No experience rating is anticipated. 

5. The plan is to be financed on an actuarial basis including the 
maintenance of proper reserves. 


6. The only requirement for insurability is the earning of a 
stated minimum of wages in a given calendar year referred to as 
the “base year.” This stated minimum of wages is thereafter in the 
text referred to as “qualifying wages.” 

7. Workers insured at the end of a given base year are eligible 
for benefits in a succeeding one-year period referred to as the 
“benefit year.” Under the Rhode Island plan, for instance, the 
base year is a calendar year and the benefit year is a 12-month 
period beginning in April of the next calendar year. Throughout 
the benefit year the individual benefit rate remains constant 
depending on the worker’s earnings in the base year or in a 
specified portion thereof. 


8. There is an initial waiting period in each benefit year. Benefits
are paid on a weekly or daily basis up to a specified maximum
duration.

9. The plan pays cash benefits only when there is a loss of wages 
as a result of sickness. However, the benefit rate depends only 
on the total creditable wages in the base year. No medical ex- 
penses or death benefits are included. 

10. The amount of benefit does not depend on the worker’s 
marital or parental status. In other words, there are no family al- 
lowances. It is also assumed that the benefit payments have a 
uniform duration for every qualified worker. 


STATISTICS REQUIRED 


It is assumed that with respect to each worker the following data 
will be available. 
1. Social security account number. 
2. Sex, race, and occupation. 
3. Year of birth. 
4. Taxable wages in the base year with special provisions made
for showing quarterly earnings if the benefit is based on earnings in
a specified quarter rather than in the whole year.

5. Calendar year of experience (base year). 

6. Worker’s status as of the end of the preceding base year 
(insured, not insured). 

7. Worker’s status as of the end of the base year under con- 
sideration (apparently insured, not insured because of failure to 
earn qualifying wages). 

The above is the minimum information to be punched in the wage 
card. In addition, a claim card is to be maintained with respect to each 
worker who received benefits in the benefit year corresponding to the 
base year. This card should contain at least items 1, 2, 3, and 5 of 
the wage card and in addition 

(a) The benefit rate, and 

(b) The total of benefits received during the benefit year. 

It is believed that such information as listed above, although inade- 
quate for a proper analysis of the sickness experience, will be sufficient 
for an actuarial estimate along the lines developed in the following 
sections of the paper. It should be noted that in addition to the analysis
of the available statistics it will be necessary to prepare a long range
estimate of taxable pay rolls by calendar year.


DEFINITIONS AND NOTATIONS 


Throughout the remainder of the paper the following definitions and 
notations will be used: 


$n$                Calendar year corresponding to the base year under
                   consideration.

$x$                Mean age in the base year; equals $n$ minus year of
                   birth.

$E_x^n$            Number of workers born in $n-x$ who earned qualifying
                   wages in the two base years $n-1$ and $n$. For
                   convenience, this group will be referred to as
                   “insured permanent employees.”

$N_x^n$            Number of new entrants in the base year $n$ at mean
                   age $x$. Denotes workers born in $n-x$ who earned
                   qualifying wages in the base year $n$ but not in $n-1$.

$(wE)_x^{n+1}$     Number among the $E_x^n$ who worked in the base year
                   $n+1$ but earned less than the minimum qualifying
                   wages.

$(wN)_x^{n+1}$     Similar to $(wE)_x^{n+1}$ except that it refers to the
                   $N_x^n$ workers.

$(rE)_x^{n+1}$     Number among the $E_x^n$ who had no wages in the base
                   year $n+1$.

$(rN)_x^{n+1}$     Number among the $N_x^n$ who had no wages in the base
                   year $n+1$.

$(SE)_x^n$         Total taxable wages in $n$ earned by the $E_x^n$ workers.

$(SN)_x^n$         Total taxable wages in $n$ earned by the $N_x^n$ workers.

$(SO)^n$           Total of wages in $n$ earned by all workers who failed
                   to earn qualifying wages in that year.

$P^n$              Total taxable pay roll in $n$.

$(BE)_x^n$         Total sickness benefits paid during the benefit year
                   corresponding to the base year $n$ with respect to the
                   $E_x^n$ workers.

$(BN)_x^n$         Total sickness benefits paid during the benefit year
                   corresponding to the base year $n$ with respect to the
                   $N_x^n$ workers.


COMPUTATION OF RATES AND AVERAGES 


It is assumed that the statistics described above are available
with respect to a period of several base years, say, from $c$ to $d$. The
symbol $\sum$ denotes summation from $n=c$ to $n=d$. Benefit data
obviously need be available for the benefit years corresponding to these
base years, which will thus extend the period for which statistics must
be available for a year and a fraction beyond the calendar year $d$, while
at the same time the benefit data for the first year and a part of the
next year of the period will be ignored. The total probabilities of loss
of insured status, $q_x$, for new entrants and permanent employees
respectively are:

$$q_{[x]} = \frac{\sum \left[(wN)_x^{n+1} + (rN)_x^{n+1}\right]}{\sum N_x^n} \tag{1}$$

$$q_x = \frac{\sum \left[(wE)_x^{n+1} + (rE)_x^{n+1}\right]}{\sum E_x^n} \tag{1a}$$

The average taxable wages, $s_x$, by age in the base year, again with a
select period of one year, are:

$$s_{[x]} = \frac{\sum (SN)_x^n}{\sum N_x^n} \tag{2}$$

$$s_x = \frac{\sum (SE)_x^n}{\sum E_x^n} \tag{2a}$$

The average benefit payments in the benefit year, $b_x$, per individual
age $x$ in the base year who could be eligible for benefits with respect to
that base year, are then computed by means of the following formulas:

$$b_{[x]} = \frac{\sum (BN)_x^n}{\sum N_x^n} \tag{3}$$

$$b_x = \frac{\sum (BE)_x^n}{\sum E_x^n} \tag{3a}$$

A set of $r_x$'s, giving the proportionate age distribution of new entrants,
is obtained from the formula:

$$r_x = \frac{\sum N_x^n}{\sum_x \sum N_x^n} \tag{4}$$

Finally, a ratio $g$, which represents the portion of the total taxable pay
roll which gives no rise to any benefit payments, is computed from:

$$g = \frac{\sum (SO)^n}{\sum P^n} \tag{5}$$
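The pooling in formulas (1) through (3a) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `AgeYearCell` record layout, the function name, and every figure below are hypothetical and are not taken from any actual plan. For one age $x$, each cell holds the tabulated totals of one base year $n$, and the sums run over the base years $c$ to $d$.

```python
from dataclasses import dataclass


@dataclass
class AgeYearCell:
    """Tabulated totals for one age x in one base year n (hypothetical layout)."""
    N: int       # new entrants, N_x^n
    E: int       # insured permanent employees, E_x^n
    wN: int      # new entrants below qualifying wages in n+1
    rN: int      # new entrants with no wages in n+1
    wE: int      # permanent employees below qualifying wages in n+1
    rE: int      # permanent employees with no wages in n+1
    SN: float    # taxable wages in n of the new entrants
    SE: float    # taxable wages in n of the permanent employees
    BN: float    # benefits paid with respect to the new entrants
    BE: float    # benefits paid with respect to the permanent employees


def rates_and_averages(cells):
    """Pool the base years c..d for one age; 'select' means new entrants."""
    total_N = sum(c.N for c in cells)
    total_E = sum(c.E for c in cells)
    return {
        "q_select": sum(c.wN + c.rN for c in cells) / total_N,    # (1)
        "q_ultimate": sum(c.wE + c.rE for c in cells) / total_E,  # (1a)
        "s_select": sum(c.SN for c in cells) / total_N,           # (2)
        "s_ultimate": sum(c.SE for c in cells) / total_E,         # (2a)
        "b_select": sum(c.BN for c in cells) / total_N,           # (3)
        "b_ultimate": sum(c.BE for c in cells) / total_E,         # (3a)
    }


# Two illustrative base years for a single age (made-up figures).
cells = [
    AgeYearCell(N=200, E=800, wN=30, rN=20, wE=40, rE=40,
                SN=300_000.0, SE=1_600_000.0, BN=2_000.0, BE=12_000.0),
    AgeYearCell(N=300, E=1_200, wN=40, rN=35, wE=60, rE=60,
                SN=450_000.0, SE=2_400_000.0, BN=3_000.0, BE=18_000.0),
]
result = rates_and_averages(cells)
print(result)
```

Pooling numerators and denominators before dividing, rather than averaging yearly ratios, follows the form of the formulas and weights each base year by its exposure.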


THE SERVICE TABLE AND ACTUARIAL FUNCTIONS 


All functions (1) through (5) can easily be obtained from proper
tabulations. The rates and the averages may first be computed for
quinquennial age groups. The values by single age can then be obtained by
one of the several standard interpolation methods.

At this juncture it becomes necessary to construct a service table.
In the case of a pension fund, the service table would take separate
account of exits from service by reason of death, withdrawal,
disability, and retirement. If we denote the tabular number in service at
age $x$ by $l_x$, the ‘in service survivors’ a year later, or $l_{x+1}$,
would be obtained by subtracting from $l_x$ all exits between ages $x$ and
$x+1$ as computed on this base. As an illustration, let us assume that
$l_{25} = 100{,}000$ and that the total of the probabilities of exit is .2.
Then, the computed number of exits between ages 25 and 26 is 20,000, and
$l_{26} = 80{,}000$. For the purpose of the development presented in this
paper a special definition of exit from service was introduced. Here, an
individual is considered to have left covered employment if for any
reason whatsoever he failed to earn qualifying wages in a given base
year. The combined probabilities of such exit are defined by equations
(1) and (1a). Our particular service table is then constructed as
follows:

Begin with a selected young age, say, 15.

$$\begin{aligned}
\text{Put}\quad l_{15} &= 100{,}000 \\
\text{Then,}\quad l_{16} &= l_{15}(1 - q_{15}) \\
l_{17} &= l_{16}(1 - q_{16}) \\
&\;\;\vdots \\
l_{x+1} &= l_x(1 - q_x)
\end{aligned} \tag{6}$$

The select section of the service table is then computed by the
formula:

$$l_{[x]} = \frac{l_{x+1}}{1 - q_{[x]}} \tag{7}$$
It will be noted that from the way the table is constructed we have
the following relationships:

$$\frac{l_{x+1}}{l_{[x]}} = \frac{\text{Those among the insured new entrants } N_x^n \text{ who also earned qualifying wages in the base year } n+1}{N_x^n} \tag{8}$$

$$\frac{l_{x+1}}{l_x} = \frac{\text{Those among the insured permanent employees } E_x^n \text{ who also earned qualifying wages in the base year } n+1}{E_x^n} \tag{8a}$$

We now compute the following actuarial functions:

$$D_{[x]} = v^x l_{[x]}, \quad \text{and} \quad D_x = v^x l_x \tag{9}$$

where $v = 1/(1+i)$, $i$ being the interest rate used in the calculation.

$${}^{s}N_x = D_x b_x + D_{x+1} b_{x+1} + \cdots + D_\omega b_\omega \tag{10}$$

where $\omega$ denotes the limiting age which in practical situations will not
be more than 80.

$${}^{s}N_{[x]} = D_{[x]} b_{[x]} + {}^{s}N_{x+1} \tag{10a}$$

$${}^{s}A_{[x]} = v^{m} \cdot \frac{{}^{s}N_{[x]}}{D_{[x]}} \tag{11}$$

$${}^{s}A_x = v^{m} \cdot \frac{{}^{s}N_x}{D_x} \tag{11a}$$

In formulas (11) and (11a), $m$ is the period (in years) between the
end of the base year and the middle of the corresponding benefit year.

The ${}^{s}A_x$ represent the average lump-sum present value of all future
benefit payments based on wages in the base year just ended and on
future base years during which the individual will earn qualifying
wages continuously from the present year on.
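As a rough illustration of formulas (9) through (11a), the sketch below builds the discounted functions from a service table and a set of average benefits. The function name, argument layout, and sample values are hypothetical; ages are assumed to be consecutive up to the limiting age.

```python
def actuarial_functions(l_ult, l_sel, b_ult, b_sel, i, m):
    """Return the A and select-A values keyed by age.

    l_ult, b_ult: ultimate service table and average benefits, covering
    every age from min(b_ult) to the limiting age max(b_ult).
    l_sel, b_sel: select (new entrant) values for the entry ages.
    i: interest rate; m: years from the end of the base year to the
    middle of the benefit year. All inputs here are illustrative.
    """
    v = 1.0 / (1.0 + i)
    ages = sorted(b_ult)
    D = {x: v ** x * l_ult[x] for x in ages}              # (9)
    D_sel = {x: v ** x * l_sel[x] for x in b_sel}         # (9)

    N = {}
    running = 0.0
    for x in reversed(ages):        # accumulate from the limiting age down
        running += D[x] * b_ult[x]
        N[x] = running                                    # (10)

    N_sel = {x: D_sel[x] * b_sel[x] + N.get(x + 1, 0.0)
             for x in b_sel}                              # (10a)
    A = {x: v ** m * N[x] / D[x] for x in ages}           # (11a)
    A_sel = {x: v ** m * N_sel[x] / D_sel[x] for x in b_sel}  # (11)
    return A, A_sel


# Made-up three-age table, 3 per cent interest, m of three-quarters of a year.
sample_l = {20: 1000.0, 21: 800.0, 22: 600.0}
sample_l_sel = {20: 1200.0, 21: 900.0}
sample_b = {20: 10.0, 21: 12.0, 22: 15.0}
sample_b_sel = {20: 5.0, 21: 6.0}
A, A_sel = actuarial_functions(sample_l, sample_l_sel, sample_b, sample_b_sel,
                               i=0.03, m=0.75)
print({x: round(a, 2) for x, a in A.items()})
```

Summing from the limiting age downward gives every value of the benefit function in a single pass, which mirrors the way such columns were cumulated by hand.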


LEVEL COST CALCULATIONS 


Assume that a valuation of the sickness insurance plan is to be made
as of the end of the calendar year $n$. On the valuation date there will be,
say, $E_x^n$ insured permanent employees, and $N_x^n$ insured new entrants.
In addition, benefits may still be payable for the remainder of the benefit
year with respect to the $E_x^{n-1}$ and $N_x^{n-1}$ groups. Let us furthermore
assume that the balance in the account on an accrual basis on the
valuation date, that is, the funds on hand plus any additional income due
but not yet received less disbursements obligated but not yet paid, is
$R^n$.

The present value of the liabilities with respect to persons insured on
the valuation date may then be computed as follows:

With respect to the $E_x^n$ insured permanent employees

$$L^{E} = \sum_x E_x^n \cdot {}^{s}A_x \tag{12}$$


With respect to the insured new entrants $N_x^n$

$$L^{N} = \sum_x N_x^n \cdot {}^{s}A_{[x]} \tag{13}$$

With respect to the workers eligible for benefits in the remaining
portion of the current year

$$L^{B} = v^{h/2} \sum_x \left(E_x^{n-1} b_x + N_x^{n-1} b_{[x]}\right) \cdot f \tag{14}$$

where $h$ denotes the fraction of the remaining current benefit year, and
$f$ is a modifying factor to allow for the seasonal variation in
disbursements throughout the benefit year.

$$\text{Total } L_1 = L^{E} + L^{N} + L^{B} \tag{15}$$


At this juncture it becomes necessary to compute pay roll functions
as follows:
Present value of 1 per cent of all future taxable pay rolls

$$P = .01(1+i)^{j} \sum_{k=1}^{\infty} P^{n+k} v^{k} \tag{16}$$

Here $j$ is less than 1 and $(1+i)^{j}$ is designed to make allowance for the
fact that the contributions on the average come in sooner than at the
end of the calendar year. Thus, for instance, if contributions with
respect to a given calendar quarter are collected at the end of the
succeeding quarter, $j$ may be taken as 1/8. The individual present value
${}^{s}P_x$ for an insured worker age $x$ of 1 per cent of future creditable
wages, taken into account only as long as the individual continues to earn
qualifying wages in consecutive years beginning from the present time, is
obtained from the formula:

$${}^{s}P_x = .01(1+i)^{j}\, \frac{\sum_{k=1}^{\omega - x} D_{x+k}\, s_{x+k}}{D_x} \tag{17}$$
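The size of $j$ in (16) can be checked by simple arithmetic. Under the collection schedule described above (each quarter's contributions received at the end of the following quarter), and on the added assumption of equal pay rolls in every quarter, the average lead of collections over the end of the calendar year works out to one-eighth of a year. The sketch below shows this and evaluates (16) over a finite horizon; the function name, the truncation, and the sample pay rolls are illustrative assumptions.

```python
from fractions import Fraction

# Quarter k (k = 1..4) ends at k/4 of the year and its contributions are
# collected one quarter later, at (k + 1)/4. The lead over the end of the
# calendar year is 1 - (k + 1)/4, which is negative for the fourth quarter.
leads = [1 - Fraction(k + 1, 4) for k in range(1, 5)]
j = sum(leads) / len(leads)   # equal quarterly pay rolls assumed


def pv_one_per_cent_of_payrolls(future_payrolls, i, j):
    """Formula (16), truncated to the years supplied (n+1, n+2, ...)."""
    v = 1.0 / (1.0 + i)
    discounted = sum(p * v ** k
                     for k, p in enumerate(future_payrolls, start=1))
    return 0.01 * (1.0 + i) ** float(j) * discounted


# Hypothetical level pay roll of 1,000 for three years at 3 per cent.
example_pv = pv_one_per_cent_of_payrolls([1000.0, 1000.0, 1000.0], 0.03, j)
print(j, round(example_pv, 4))
```

A valuation into perpetuity would need an explicit long range pay roll projection in place of the three-year list used here.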





The present value on the valuation date of 1 per cent of such future
wages for all employees insured by reason of having qualifying wages in
the last base year thus becomes:

$$P_1 = \sum_x \left(E_x^n + N_x^n\right) \cdot {}^{s}P_x \tag{18}$$

The individual present value as of the end of the year of entry for
future entrants, otherwise similar to ${}^{s}P_x$, is:

$${}^{s}P_{[x]} = \frac{.01\, D_{[x]} \cdot s_{[x]} + D_{x+1} \cdot {}^{s}P_{x+1}}{D_{[x]}} \tag{19}$$

The theoretical adjustment for interest on the $D_{[x]} \cdot s_{[x]}$ component is
negligible and can be omitted.

For a group of, say, 10,000 future entrants we have the present
value of their benefits (following them only as long as they continue to
earn qualifying wages in successive years)

$$L' = 10{,}000 \sum_x r_x \cdot {}^{s}A_{[x]} \tag{20}$$

and the present value of 1 per cent of their creditable wages during the
period of continued qualifying wages is

$$P' = 10{,}000 \sum_x r_x \cdot {}^{s}P_{[x]} \tag{21}$$

Thus, the level cost of benefits to future entrants as a per cent of their
pay roll is

$$t = \frac{L'}{P'} \tag{22}$$
The final cost calculations are then as follows:

Item                                                         Symbol       Formula

A. Balance in the account                                    $R^n$
B. Liabilities with respect to workers with qualifying
   wages in either the last base year or in the previous
   one                                                       $L_1$        (15)
C. Present value of 1 per cent of future taxable pay rolls   $P$          (16)
D. Present value of 1 per cent of future pay rolls with
   respect to workers who will fail to earn qualifying
   wages                                                     $g \cdot P$  (5)
E. Present value of 1 per cent of future pay rolls for
   insured employees taken only so long as they continue
   to earn qualifying wages, considering only workers
   with qualifying wages in the last base year               $P_1$        (18)
F. Present value of 1 per cent of pay roll of future
   entrants so long as they will continue to earn
   qualifying wages (C - D - E)                              $P_2$
G. Normal level percentage cost for future entrants          $t$          (22)
H. Value of benefits for future entrants (item G times
   item F)                                                   $L_2$
I. Total net liabilities (B + H - A)                         $L$
J. Net level cost on the valuation date as a per cent of
   the taxable pay roll                                      $L/P$
K. Loading for administrative expenses and contingencies
   (per cent of taxable pay roll)                            $e$
L. Total cost                                                $L/P + e$
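The arithmetic of items A through L can be traced with a short function. This is a sketch only: the function name, the argument names, and all the figures are hypothetical, and the inputs are taken as already computed from formulas (5), (15), (16), (18), and (22).

```python
def final_cost(balance, liabilities, pv_payroll, g, pv_insured, t, loading):
    """Carry out items A through L of the final cost calculation.

    balance     -- item A, balance in the account
    liabilities -- item B, from formula (15)
    pv_payroll  -- item C, from formula (16)
    g           -- ratio from formula (5)
    pv_insured  -- item E, from formula (18)
    t           -- item G, level cost for future entrants, formula (22)
    loading     -- item K, as a per cent of taxable pay roll
    """
    item_d = g * pv_payroll                      # pay rolls of non-qualifiers
    item_f = pv_payroll - item_d - pv_insured    # C - D - E
    item_h = t * item_f                          # benefits of future entrants
    item_i = liabilities + item_h - balance      # B + H - A
    item_j = item_i / pv_payroll                 # net level cost, per cent
    return {"F": item_f, "H": item_h, "I": item_i,
            "J": item_j, "L": item_j + loading}


# Made-up inputs, in arbitrary money units.
cost = final_cost(balance=50.0, liabilities=400.0, pv_payroll=1000.0,
                  g=0.01, pv_insured=300.0, t=0.9, loading=0.1)
print({k: round(val, 3) for k, val in cost.items()})
```

Since items C, D, E, and F are all present values of 1 per cent of pay roll, the quotient in item J is already expressed as a per cent of taxable pay roll and the loading can be added to it directly.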













CONCLUDING REMARKS 


1. The formulas presented before are related to an estimate extend- 
ing into perpetuity. It is apparent that the same type of formulas can 
be set down for a calculation limited to a given number of years. Such 
a modification would not involve any special theoretical difficulties. 

2. The service table has been developed without the use of death 
rates. This approach was taken because in practice reliable statistics 
on deaths in service would not be available. It should be pointed out 
that for all practical intents and purposes deaths have been taken into 
consideration since in the development of the probabilities of loss of 
insurance the expressions on the right hand side of equations (1)
include separations through death. In addition, deaths among
individuals insured in a given benefit year were also accounted for by using
as the numerators of equations (3) the total sickness benefits actually
paid. If the existence of applicable mortality statistics could be
assumed, a modified service table could easily be constructed, although
such a refinement would probably not be warranted. 

3. The level cost calculations proceed on the tacit assumption that 
the ratio g (formula 5) will remain constant throughout the years. 
Such will undoubtedly not be the case. However, the ratio g is bound 
to be small so that cyclical or other variations in its value will not have 
a significant effect on the results of the valuation. The annual wage
data published by the Railroad Retirement Board (Compensation
and Service of Railroad Employees) give some indication of the
magnitude of this ratio. Thus, in 1945, less than 1 per cent of the total
creditable compensation went to employees who earned in that year 
less than $150. 

4. The method makes no allowance for future changes in relation- 
ships between benefits according to age, individual wage levels, age dis- 
tribution of new entrants, and benefit provisions. In this respect, the 
procedure is similar to the one commonly used in pension fund work. 
Instead of trying to introduce the probable variations directly in the
calculations, it would be much simpler to make two supplementary
estimates according to so-called “high” and “low” assumptions. Since
cost estimates of this kind are subject to considerable errors regardless
of the mathematical refinements introduced, the supplementary
calculations are in any event necessary for the establishment of a proper
range for the cost figure.


EARNINGS OF NONFARM EMPLOYEES IN THE U.S.,
1890-1946


STANLEY LEBERGOTT* 
Montreal, Canada 


Estimates of money and real earnings of nonfarm em- 
ployees are presented for the years 1890 to 1946, these es- 
timates reflecting the loss in earnings which arises from 
unemployment. Series are likewise given for full time equiv- 
alent earnings of nonfarm employees as a group and for each 
of the major industry groups for 1919-46, while the Bureau of 
Labor Statistics and the Bureau of the Census have developed 
estimates of unemployment for the period 1920-46. The 
sources of the rise in earnings after World War I are analyzed 
and it is concluded that about three-fourths of the rise in 
earnings derived from increased pay on a given job, while 
one-quarter originated in the changing distribution and char- 
acteristics of nonfarm employees. 


Since the monumental endeavors of Paul Douglas first provided us
with estimates of wages for the period 1890-1926, a massive
development in the basic materials for income estimation has taken place.¹
The National Bureau of Economic Research and the Department of
Commerce have between them provided estimates of full time earnings
for the period 1919-1946,² and the Bureau of Labor Statistics has
developed estimates of unemployment for the period 1920-1940.³ The
present study draws upon these materials and those of Douglas to
provide estimates of:
Money and real earnings of nonfarm employees, 1890-1946.
Money earnings of employed workers by major industry groups,
1919-1946.
Attention is also given to the chief sources for the rise in earnings
during the period following World War I.⁴ Section 1 discusses the general
trend of earnings and the sources of the rise in earnings; Section 2 is
devoted to a discussion of full time earnings by industry; while Section
3 deals with the actual derivation of the estimates.


* The views expressed in this article are those of the author and do not necessarily reflect those of
any institution with which he is connected. The writer wishes to acknowledge the assistance of Sophia
Cooper in the preparation of Tables 5-7.

1 Paul Douglas, Real Wages in the United States, 1890-1926, 1930.

2 Simon Kuznets, National Income and Its Composition, 1919-1938, 1941; Survey of Current
Business, National Income Supplement, July 1947.

3 Preliminary BLS estimates for 1929-40 appear in the writer’s technical memorandum of July
1945. The estimates for 1920-28 are on a comparable basis and will be released at a later date. 1940-1946
data are from Bureau of the Census, Labor Force, Employment and Unemployment in the United States,
1940-1946, 1947.

4 Throughout this paper references are to employees in nonfarm industry. The term earnings refers
to wages, salaries and supplements when used in connection with Tables 1 and 2 and the related text,
but does not include supplements in other contexts.

The limitations of the basic data used in this study have been dis- 
cussed in the sources from which they are drawn and are by now well 
known. But by way of caution it may be reiterated that estimates of 
unemployment, earnings and cost of living in the years prior to 1930 
are inevitably imperfect. The estimates presented in this paper par- 
take of these imperfections. Nevertheless such estimates are required 
to give us a somewhat better perspective on the course of wage earner 
incomes than can be gotten from a review of data for the years since 
1930. It is believed that the present series does give such a perspective, 
being of the right order of magnitude, moving in the right general direc- 
tion and being an improvement over existing estimates for this period. 


I 


The money earnings of the average nonfarm employee rose by 251
per cent over the half century, 1890-94 to 1940-44, while real earnings
gained by 68 per cent over the same period. As Tables 1 and 2
emphasize, the gain in real earnings over the decades from McKinley to
Coolidge was a most moderate one, the real rise beginning in the much
maligned years of the new prosperity. The effect of the depression was
so comprehensive, so catastrophic, however, that real earnings in
1930-34 were almost identical with those in 1890-94, forty years before.
Despite a vast and persistent gain in wage rates (full time equivalent
earnings rose 136 per cent in the interval) the fact that almost a fifth










TABLE 1
EARNINGS OF NONFARM EMPLOYEES
1890-1946

Period      Full Time         Money       Real Earnings
           Equivalent      Earnings   (1910-14 dollars)
             Earnings

1890-94          $553          $502                $636
1895-99           542           468                 597
1900-04           590           554                 644
1905-09           641           597                 647
1910-14           700           649                 649
1915-19           978           918                 678
1920-24         1,392         1,235                 682
1925-29         1,492         1,384                 781
1930-34         1,307           950                 640
1935-39         1,366         1,031                 721
1940-44         1,910         1,763               1,066
1945-46         2,508         2,408               1,261











TABLE 2
MONEY AND REAL EARNINGS OF NONFARM EMPLOYEES
1890-1946

Year    Full Time    Per cent of        Money Cost of Living  Real Earnings
       Equivalent   Time Lost by     Earnings     (1910-14 =       (1910-14
         Earnings   Unemployment                      100.0)       dollars)

1890         $559            6.2         $524           81.4           $644
1891          560            6.7          522           79.0            661
1892          567            4.6          541           79.0            685
1893          553           10.6          494           78.5            629
1894          524           18.1          429           76.5            561
1895          542           13.3          470           76.5            614
1896          537           16.6          448           78.5            571
1897          537           15.8          452           78.5            576
1898          542           14.8          462           78.5            589
1899          553            8.4          507           80.1            633
1900          563            7.5          521           81.8            637
1901          579            5.5          547           84.0            651
1902          589            4.7          561           85.7            655
1903          611            4.9          581           89.7            648
1904          609            8.0          560           88.5            633
1905          621            4.6          591           88.5            668
1906          635            3.7          612           91.4            670
1907          659            4.8          627           95.7            655
1908          630           14.2          541           92.8            583
1909          658            6.8          613           92.8            661
1910          685            5.0          651           97.2            670
1911          677            7.0          630           99.7            632
1912          706            4.8          672          100.0            672
1913          733            6.0          689          101.0            682
1914          700           14.1          601          102.6            586
1915          729           13.3          632          103.5            611
1916          826            4.2          791          111.2            711
1917          956            3.8          920          130.8            703
1918        1,107            4.3        1,059          153.5            689
1919        1,260            6.9        1,173          176.8            663
1920        1,448            5.0        1,376          204.7            672
1921        1,347           21.0        1,064          182.4            583
1922        1,330           14.4        1,138          171.0            665
1923        1,413            6.1        1,327          174.1            762
1924        1,423           10.6        1,272          174.5            729
1925        1,450            8.5        1,327          179.1            741
1926        1,479            5.9        1,392          180.5            771
1927        1,488            8.4        1,363          177.1            770
1928        1,507            8.6        1,377          175.1            786
1929        1,535            4.9        1,460          174.9            835
1930        1,495           13.4        1,295          170.5            760
1931        1,410           24.1        1,070          155.2            689
1932        1,251           35.2          811          139.4            582
1933        1,170           36.5          743          131.9            563
1934        1,208           31.2          831          136.7            608
1935        1,261           28.8          898          140.1            641
1936        1,326           24.0        1,008          141.5            712
1937        1,413           20.1        1,129          146.7            770
1938        1,399           26.7        1,025          143.9            712
1939        1,433           23.6        1,095          141.9            772
1940        1,470           20.1        1,175          143.1            821
1941        1,623           13.6        1,402          150.2            933
1942        1,896            6.4        1,775          166.4          1,067
1943        2,195            2.6        2,138          176.5          1,211
1944        2,365            1.6        2,327          179.2          1,299
1945        2,456            2.6        2,392          183.4          1,304
1946        2,560            5.3        2,424          198.9          1,219





of our labor force was without work had completely nullified the gain 
in rates. Within the next ten years the advent first of a qualified pros- 
perity and then of a war production program had changed famine 
to feast; real earnings rose by 67.0 per cent as compared to the 0.6 per 
cent gain over the previous four decades. Money earnings doubled in 
the six years following the outbreak of both World War I and II. But 
real earnings rose 69.0 per cent in the later period, thanks to effective 
price control, as compared to a mere 15 per cent in the earlier period. 
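The percentage changes quoted in this section can be reproduced directly from Tables 1 and 2. The short check below is offered only as a reader's aid; the helper name and the dictionary layout are illustrative, and the figures are those printed in the two tables.

```python
def pct_change(start, end):
    """Percentage change from start to end."""
    return 100.0 * (end - start) / start


# Table 1: (full time equivalent, money, real) earnings by period.
TABLE_1 = {
    "1890-94": (553, 502, 636),
    "1930-34": (1307, 950, 640),
    "1940-44": (1910, 1763, 1066),
}
# Table 2: real earnings in selected single years.
REAL_BY_YEAR = {1914: 586, 1920: 672, 1939: 772, 1945: 1304}

checks = {
    "money, 1890-94 to 1940-44": pct_change(TABLE_1["1890-94"][1], TABLE_1["1940-44"][1]),
    "real, 1890-94 to 1940-44": pct_change(TABLE_1["1890-94"][2], TABLE_1["1940-44"][2]),
    "full time, 1890-94 to 1930-34": pct_change(TABLE_1["1890-94"][0], TABLE_1["1930-34"][0]),
    "real, 1890-94 to 1930-34": pct_change(TABLE_1["1890-94"][2], TABLE_1["1930-34"][2]),
    "real, 1930-34 to 1940-44": pct_change(TABLE_1["1930-34"][2], TABLE_1["1940-44"][2]),
    "real, 1914 to 1920": pct_change(REAL_BY_YEAR[1914], REAL_BY_YEAR[1920]),
    "real, 1939 to 1945": pct_change(REAL_BY_YEAR[1939], REAL_BY_YEAR[1945]),
}
for label, value in checks.items():
    print(f"{label}: {value:.1f} per cent")
```

The six-year comparisons use 1914 and 1939 as the years in which the two wars broke out, which is an assumption about the base years intended in the text.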

The scrupulous reader will keep in mind the inadequacies of any cost
of living index which, even at best, can only take inadequate account
of quality changes. But in doing so he may choose to set off against
such errors the higher costs of urban existence which a cost of living
index will likewise tend to ignore. While the measurement of family
income is a stage removed from our present concern, it seems likely that
the trend of that series would have been similar to that of the present
series.⁵ And as the present series are not measures of family income, so
they are not measures of labor costs. Information on manufacturing
for the 1920-43 period, for example, very definitely emphasizes the
point that labor costs may decline sharply at the same time that wage
rates and annual earnings gain just as sharply.

While it is not possible to establish the chief sources of the rise in
earnings over the entire half century, a very rough estimate can be
made for the years 1919-23 to 1939-43, when money earnings rose some
18 per cent. The over-all rise in full time equivalent earnings had been
21.6 per cent, 4 points of which had been offset by increased
unemployment, leaving 17.6 per cent as the rise in actual money earnings. (It is
of interest that so sizable an offset for unemployment occurs: the earlier
years included the entire postwar depression while the 1939-43 period
included at least 3 years of substantial defense and war production.)


5 In 1901 the head of family provided 80 per cent of the family’s income: in 1935-36 the figure was
82 per cent. The 1901 estimate is based on data appearing in the Commissioner of Labor's Report
(1904), pp. 362, 366. The 1935-36 estimate is based on ratios to be derived from the various BLS
Consumer Purchases volumes, these being weighted by the distribution of nonfarm families given in
Consumer Incomes in the United States, National Resources Committee, Table 24 B.

Returning to the 21.6 per cent rise in full time equivalent earnings, 
what can we say of the sources for that rise? An exceedingly rough ap- 
proximation suggests that upwards of three-quarters of the gain came 
about because of increased earnings on a given job, while the balance 
derived from changes in the composition of the labor force. Three 
elements are significant in this composition shift: industrial change, 
occupational change, and age change. 

The importance of the industrial change may be approximated by
using a constant 1919-23 industrial pattern for both periods in place
of the actual ones. Tables 3 and 4 present employment and earnings
estimates for both 1919-23 and 1939-43.⁶ By assuming a constant
distribution of weights between the manufacturing sub-groups and
between the major industrial groups we secure an estimated income gain
of 1.6 points less than the actual gain. In other words 1.6 out of the
21.6 points gain in full time earnings may be assumed to have derived
from the change in industrial composition.
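The constant-weight comparison can be approximated from the figures printed in Table 4. The sketch below is one reading of the procedure, not necessarily the exact computation behind the text: it holds the major-group employment weights at their 1919-23 values and, for manufacturing, substitutes the $1,693 income that the accompanying footnote reports for constant sub-group weights. The variable and function names are illustrative.

```python
# Table 4: (employees 1919-23, employees 1939-43,
#           income 1919-23, income 1939-43)
TABLE_4 = {
    "Mining": (1032, 928, 1696, 1658),
    "Construction": (1245, 1594, 1482, 1787),
    "Trade": (4074, 6587, 1417, 1531),
    "Finance, insurance": (996, 1344, 1809, 1860),
    "Service except domestic": (2137, 3174, 948, 1079),
    "Domestic": (1785, 1942, 598, 637),
    "Transportation": (2655, 2273, 1536, 2007),
    "Utilities": (570, 915, 1251, 1826),
    "Manufacturing": (9078, 13335, 1377, 1764),
    "Government": (2901, 5153, 1210, 1621),
}
# 1939-43 manufacturing income with constant sub-group weights (footnote).
MFG_INCOME_CONSTANT_SUBGROUPS = 1693


def weighted_mean(weights, values):
    """Employment-weighted average income."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


rows = list(TABLE_4.items())
emp_early = [r[0] for _, r in rows]
emp_late = [r[1] for _, r in rows]
inc_early = [r[2] for _, r in rows]
inc_late = [r[3] for _, r in rows]
inc_late_fixed = [MFG_INCOME_CONSTANT_SUBGROUPS if name == "Manufacturing" else r[3]
                  for name, r in rows]

base = weighted_mean(emp_early, inc_early)
actual_gain = 100 * (weighted_mean(emp_late, inc_late) / base - 1)
fixed_gain = 100 * (weighted_mean(emp_early, inc_late_fixed) / base - 1)
shift_points = actual_gain - fixed_gain
print(round(actual_gain, 1), round(fixed_gain, 1), round(shift_points, 1))
```

On these figures the actual gain comes out a little under the 21.6 per cent cited in the text, presumably because of rounding and coverage differences, while the gap between the actual and the constant-weight gain lands close to the 1.6 points attributed to industrial change.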

But more than a shift in industrial composition was at work. At the 
end of the period the average worker was more highly skilled. What 
portion of the rise in earnings stemmed from this source? Table 5 pre- 
sents the data used under the assumption of a constant occupational 
composition of the employee work force. Occupation data are readily
available and as here presented exclude both professional workers and
retail trade proprietors, managers and officials, since so large a
proportion of both groups is composed of self-employed persons.⁷

6 The employment data are derived from Kuznets and The National Income Supplement, with
adjustments for comparability as described below in section 3, where a description of the derivation of
earnings data likewise appears. For manufacturing, the 1939-43 income using constant sub-group weights
was $1,693 as compared to the actual $1,764.

7 Alba Edwards, Comparative Occupation Statistics, 1870-1940, 1943, pp. 186-187. For 1940 the
number of employed is used; for 1920, the number of gainfully occupied. It is assumed that the changing
distribution of self-employed persons within the specified occupational groups would not materially
affect the results.


Earnings data are less readily come by, and even conceding a number
of points we must assume that a comparison can usefully be made by
taking 1920 and 1939 occupational earnings as the same. (It can be
demonstrated that using such an assumption for the industrial data,
where we do have earnings at both dates, makes little difference in the
results.) We can fairly well estimate what proportion of the experienced
workers in the labor force who were employed 12 months in 1939 were
not wage and salary workers, hence ones whose incomes would not be
representative of employees' earnings in a given occupational group.⁸
By assuming these workers to cluster at the lower end of the wage and
salary income distribution we can then derive median wage and salary
income of full time employees in the various occupational groups.⁹
Combining these medians first with a constant and then a varying
occupational distribution we arrive at practically identical figures for
1920 and 1940. Taking account of the differences between the 1920
distribution and the 1919-23 distribution, as well as those between 1940
and 1939-43 we can conclude that there was probably a slight, though
insubstantial, gain because of occupational change.
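The near identity of the two figures can be illustrated with the employment counts and 1939 medians of Table 5. The sketch assumes that "combining" means a simple employment-weighted average of the group medians, which is an interpretation rather than a documented step; the names used are illustrative.

```python
# Table 5: (employment 1920, employment 1940, median full time earnings 1939)
TABLE_5 = {
    ("Male", "Proprietors and managers exc. trade"): (1314, 1664, 2600),
    ("Male", "Clerks"): (3491, 4785, 1500),
    ("Male", "Skilled workers"): (5469, 5089, 1600),
    ("Male", "Semi-skilled workers"): (4371, 6190, 1250),
    ("Male", "Laborers exc. farm"): (5819, 3648, 1000),
    ("Male", "Servants"): (675, 1300, 1000),
    ("Female", "Proprietors and managers exc. trade"): (78, 193, 1450),
    ("Female", "Clerks"): (2191, 3248, 1000),
    ("Female", "Skilled workers"): (102, 87, 1200),
    ("Female", "Semi-skilled workers"): (2260, 3072, 950),
    ("Female", "Laborers exc. farm"): (200, 100, 750),
    ("Female", "Servants"): (1570, 2346, 400),
}


def mean_earnings(year_index):
    """Average of the 1939 medians weighted by employment in one census year.

    year_index 0 uses the 1920 distribution, 1 uses the 1940 distribution.
    """
    total_pay = sum(row[year_index] * row[2] for row in TABLE_5.values())
    total_emp = sum(row[year_index] for row in TABLE_5.values())
    return total_pay / total_emp


mean_1920 = mean_earnings(0)   # constant (1920) occupational distribution
mean_1940 = mean_earnings(1)   # varying (1940) occupational distribution
print(round(mean_1920), round(mean_1940))
```

Under this reading the two weighted averages differ by well under 1 per cent, which agrees with the statement that the occupational shift by itself contributed little.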

A third factor to be considered as a source for the rise in earnings is
the changing age distribution of the labor force. By comparing the
urban workers' age distribution of 1920 with that of 1940 and assuming


TABLE 3

EARNINGS AND EMPLOYMENT BY MANUFACTURING INDUSTRY
1919-23 and 1939-43

               Average Full Time    Average Full Time
              Equivalent Employees  Equivalent Incomes
Industry         1919-23   1939-43   1919-23   1939-43

Food                 849     1,299     1,412     1,551
Tobacco              166       102       983     1,140
Textile            1,161     1,306     1,076     1,209
Apparel              618     1,003     1,305     1,226
Lumber               579       551     1,102     1,113
Furniture            163       423     1,263     1,372
Paper                233       360     1,340     1,689
Printing             442       566     1,609     1,893
Chemicals            448       619     1,567     1,948
Petr. coal           111       165     1,844     2,227
Rubber               120       181     1,523     1,901
Leather              316       385     1,283     1,284
Stone                324       399     1,393     1,620
Iron                 663     1,709     1,594     2,007
Nonferrous           320       396     1,464     1,951
Machinery          1,514     2,943     1,557     2,193
Auto                 362       513     1,674     2,359










8 From the 1940 Census of Population, Occupational Characteristics, Table 13, we can compute the
per cent of wage or salary workers in each occupational group that worked 12 months. Applying these
percentages to the number of wage earners in the complete count (1940 Census of Population, The
Labor Force, Part I, Table 72), deducting the result from the total of experienced persons in the labor
force 12 months, we have the number of non-wage or salary workers that worked 12 months in 1939.

9 Medians were computed by deducting the number of non-wage or salary workers who worked 12
months from the total experienced labor force who worked 12 months, cumulating from the $0 income
group. (Income distribution from The Labor Force, loc. cit.) The clustering premise is clearly
inadequate. Just as clearly its bias is not going to make much difference in the medians.


TABLE 4

EARNINGS AND EMPLOYMENT BY MAJOR INDUSTRY GROUPS
1919-23 and 1939-43

                           Average Full Time    Average Full Time
                          Equivalent Employees  Equivalent Income
Industry group               1919-23   1939-43   1919-23   1939-43

Mining                         1,032       928    $1,696    $1,658
Construction                   1,245     1,594     1,482     1,787
Trade                          4,074     6,587     1,417     1,531
Finance, insurance               996     1,344     1,809     1,860
Service except domestic        2,137     3,174       948     1,079
Domestic                       1,785     1,942       598       637
Transportation                 2,655     2,273     1,536     2,007
Utilities                        570       915     1,251     1,826
Manufacturing                  9,078    13,335     1,377     1,764
Government                     2,901     5,153     1,210     1,621














the same difference between earnings at different ages existed in 1920
as did in 1939 we find about a 3 per cent rise in earnings because of
changes in age composition.¹⁰ This 3 per cent rise is based on a 5 per
cent rise because of changing age composition, 2 points of which were
lost because of the increased importance of women in the labor force,
women generally receiving lower incomes than men.


TABLE 5

EMPLOYMENT (1920 AND 1940) AND EARNINGS OF FULL TIME
EMPLOYEES (1939) BY OCCUPATION GROUP

                                      Male                          Female
                                             Median                        Median
                                             Full Time                     Full Time
                           Employment        Earnings of  Employment       Earnings of
                                             Employees                     Employees
Occupation group           1920     1940     1939         1920     1940    1939

Proprietors and managers
  exc. trade              1,314    1,664    $2,600          78      193    $1,450
Clerks                    3,491    4,785     1,500       2,191    3,248     1,000
Skilled workers           5,469    5,089     1,600         102       87     1,200
Semi-skilled workers      4,371    6,190     1,250       2,260    3,072       950
Laborers exc. farm        5,819    3,648     1,000         200      100       750
Servants                    675    1,300     1,000       1,570    2,346       400

















10 The age distributions are based on those for non-agricultural gainful workers other than those 
in trade as given in the 1920 Census of Population, Vol. IV, p. 376, and the 1940 Census of Population, 
Vol. III, Pt. I, pp. 197, 199. Earnings by age and sex are from 1940 Census of Population, Wage and
Salary Incomes in 1939, p. 107. 















It is not possible simply to add the 1.6 point rise in earnings because 
of changing industrial composition, the 3.0 per cent rise because of 
changing age composition and the gain, say of 0.5 per cent, because of 
varying occupational composition. This is so partly because the figures 
may not be exactly compared, partly because the changing industrial
composition overlaps the age change, and so on. We must therefore be
content with the understanding that of the 22 per cent gain in full time
earnings something upwards of three-quarters arose because of a rise
in earnings on a given job and a relatively small amount derived from a
rising level of skill in the labor force or the rising level of maturity.
(With the essential addendum that part of the rise which did take place
was nullified by the increased role of women in the labor force.¹¹)


II 


Comparable estimates of full time equivalent earnings by industry
for the years 1919-1943 are given in Tables 6 and 7. In terms of full
time earnings, as Chart I emphasizes, the various manufacturing industries
retained their general position with reference to one another over
the period 1919-23 to 1939-43. Petroleum and coal products manufacturing
was the highest paying industry in both quinquennia while
tobacco products was the lowest in one, next to the lowest in the other.
The sharpest shift was that of printing, which fell from 2nd to 7th
place, while nonferrous manufacturing rose from 7th to 4th. The
similarity of structure is likewise evident in data for major industrial
groups (Chart II), although it is nowhere near as neat.¹²

In general the per cent changes in income were related to the changes 
in employment as Chart III suggests for manufacturing industries and 
as the freehand curve in Chart IV less reasonably suggests for major 
industrial groups. It is clear that there is a close relationship between 
changes of income and employment in manufacturing, a much rougher 


11 Inasmuch as the earnings data used for the occupational and age calculation were from the 1940
Census it might be noted that by also using median earnings data by industry for 1939 from the Census
(Wage and Salary Income in 1939, pp. 146-147) in place of the 1939-43 full time equivalent averages,
one would get a rise one-half of one per cent greater.

The reader may not unreasonably be surprised by the small gains attributable to occupational
and industrial shifts. Several factors might be noted. (1) These data apply only to nonfarm employees; 
the rising importance of service industries and the declining importance of agriculture is therefore irrele- 
vant. (2) A rising proportion of women in the labor force has tended to limit the rise in the average wage. 
(3) Because in point of fact a rise ascribed to a change in industrial composition only came about through 
the operation of changes in age and occupations as well it will not do to emphasize one factor to the 
exclusion of the others. 

12 A more extended discussion appears in the writer's “Wage Structures,” Review of Economic 
Statistics, November, 1947. 
CHART I.
EARNINGS AND EMPLOYMENT BY MANUFACTURING INDUSTRY, 
1919-23 AND 1939-43 
one for major industries.¹³ Whether this contrast arises from the fact
that productivity changes in manufacturing industries tended to occur 
TABLE 6

AVERAGE ANNUAL EARNINGS PER FULL TIME EMPLOYEE
Major Industry Groups
(in dollars)

Columns: (1) Mining; (2) Contract Construction; (3) Manufacturing;
(4) Wholesale and Retail Trade; (5) Finance, Insurance and Real Estate;
(6) Transportation; (7) Communication and Public Utilities;
(8) Domestic Service; (9) Service except Domestic.

Year     (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)
1919   1,427   1,394   1,304   1,397   1,578   1,403   1,087     510   1,103
1920   1,758   1,709   1,532   1,436   1,757   1,694   1,267     578   1,201
1921   1,807   1,386   1,345   1,361   1,865   1,571   1,300     620   1,201
1922   1,657   1,305   1,295   1,419   1,939   1,501   1,290     633   1,197
1923   1,831   1,614   1,410   1,472   1,904   1,513   1,313     647   1,219
1924   1,738   1,621   1,431   1,462   1,954   1,529   1,383     668   1,242
1925   1,620   1,656   1,454   1,539   2,010   1,553   1,390     681   1,252
1926   1,655   1,664   1,478   1,583   2,022   1,569   1,432     684   1,267
1927   1,630   1,707   1,503   1,522   2,033   1,579   1,443     690   1,324
1928   1,570   1,717   1,535   1,546   2,059   1,609   1,474     688   1,313
1929   1,526   1,674   1,543   1,597   2,090   1,642   1,474     701   1,371
1930   1,424   1,526   1,488   1,568   2,001   1,610   1,497     650   1,373
1931   1,221   1,233   1,369   1,497   1,886   1,549   1,514     561   1,526
1932   1,016     907   1,150   1,318   1,687   1,373   1,438     477   1,210
1933     990     869   1,086   1,187   1,591   1,334   1,351     442   1,119
1934   1,108     942   1,153   1,232   1,635   1,393   1,426     455   1,129
1935   1,154   1,027   1,216   1,281   1,668   1,492   1,486     467   1,150
1936   1,263   1,178   1,287   1,299   1,747   1,582   1,522     487   1,181
1937   1,366   1,278   1,376   1,356   1,819   1,644   1,601     536   1,215
1938   1,282   1,193   1,296   1,357   1,762   1,676   1,674     506   1,220
1939   1,367   1,268   1,363   1,365   1,761   1,723   1,692     520   1,235
1940   1,388   1,330   1,432   1,391   1,754   1,754   1,718     533   1,240
1941   1,579   1,638   1,653   1,491   1,805   1,888   1,766     578   1,291
1942   1,795   2,194   2,023   1,626   1,918   2,181   1,881     678   1,400
1943   2,160   2,505   2,350   1,804   2,071   2,491   2,075     876   1,567
1944   2,499   2,602   2,517   1,965   2,202   2,677   2,248   1,080   1,724
1945   2,618   2,612   2,525   2,134   2,365   2,732   2,416   1,230   1,839
1946   2,677   2,581   2,512   2,392   2,567   2,937   2,560   1,328   2,032





in unison and proportion unlike those in nonmanufacturing, or whether
it arises from still other factors cannot be sufficiently dealt with in the
compass of the present article.
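
The quinquennial figures of Table 4 can be related to the annual series of Table 6. The sketch below assumes that a period figure is a simple unweighted mean of the five annual values. That is an assumption made here for illustration, not a procedure stated in the text, although it does reproduce the Table 4 earnings entries for mining and manufacturing; it is not claimed that every Table 4 entry was derived in this way. The function names are illustrative only.

```python
# Annual earnings per full-time employee, transcribed from Table 6.
MINING = {1919: 1427, 1920: 1758, 1921: 1807, 1922: 1657, 1923: 1831,
          1939: 1367, 1940: 1388, 1941: 1579, 1942: 1795, 1943: 2160}
MANUFACTURING = {1919: 1304, 1920: 1532, 1921: 1345, 1922: 1295, 1923: 1410,
                 1939: 1363, 1940: 1432, 1941: 1653, 1942: 2023, 1943: 2350}


def period_mean(series, first_year, last_year):
    """Unweighted mean of the annual figures from first_year through last_year."""
    values = [series[year] for year in range(first_year, last_year + 1)]
    return sum(values) / len(values)


def per_cent_change(early, late):
    """Per cent change from the early figure to the late one."""
    return 100.0 * (late - early) / early


for name, series in (("Mining", MINING), ("Manufacturing", MANUFACTURING)):
    early = period_mean(series, 1919, 1923)
    late = period_mean(series, 1939, 1943)
    print(f"{name:14s} {early:8.0f} {late:8.0f} {per_cent_change(early, late):6.1f}")
```

Under this assumption mining shows a slight decline between the two quinquennia, while manufacturing shows a rise of somewhat more than a quarter.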


13 The relationship between income and employment changes in manufacturing, and these changes
in relation to productivity gains, has been ably discussed in Solomon Fabricant’s Employment in
Manufacturing, 1899-1939, 1942, especially pp. 100-113.
To establish the basic series for real earnings of nonfarm employees
three main constituent series are required: one for full time equivalent 
earnings, a second for per cent of time lost by unemployment and a 
final series for changes in the cost of living. The derivation of each of 
the required series will be discussed in turn. 


Full time equivalent earnings. Series for the 1929-1940 period are
available from the Denison estimates for the Department of Commerce.¹⁴
For 1919-28 estimates have been made by Kuznets, but these
estimates are not comparable with the Denison figures.¹⁵ The series
were examined for the overlap period to determine the best method of
linking. For all groups the similarity of movement was obvious, but
the levels differed in most instances. After a number of necessary regroupings
of industries had been made,¹⁶ the various series were then
linked on the basis of the regression relationship for the period 1929-38.¹⁷
In a number of instances the computed Commerce-level estimates
were then ratio linked to the Commerce estimates for 1929 to prevent
a reversal or exaggeration of the 1928-30 movement.
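
The two linking devices named above can be sketched as follows. The sketch assumes a straight-line least-squares regression of the newer series on the older one over the overlap years; the text does not give the functional form, so that choice, the function names and all figures in the usage lines are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def regression_link(old_overlap, new_overlap, old_earlier):
    """Restate earlier-year values of the old series at the new series' level."""
    a, b = fit_line(old_overlap, new_overlap)
    return [a + b * value for value in old_earlier]


def ratio_link(estimates, estimate_at_link_year, target_at_link_year):
    """Scale estimates so that the link-year estimate equals the target value."""
    factor = target_at_link_year / estimate_at_link_year
    return [value * factor for value in estimates]


# Hypothetical figures, for illustration only.
old_overlap = [100.0, 110.0, 120.0]   # older series in the overlap years
new_overlap = [210.0, 230.0, 250.0]   # newer series in the same years
old_earlier = [90.0, 95.0]            # older series before the overlap

linked = regression_link(old_overlap, new_overlap, old_earlier)
print(linked)
print(ratio_link(linked, estimate_at_link_year=200.0, target_at_link_year=210.0))
```

The regression link carries both level and slope from the overlap period; the ratio link only rescales, which is why it suits the final adjustment to a single year's value.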

This 1919-43 series was then extrapolated to 1910 by the relationship
between this series and that of W. I. King,¹⁸ in the overlap period.¹⁹
This series in turn was further extrapolated to 1890 by the 1910-26


14 Survey of Current Business, National Income Supplement, July 1947.

15 Simon Kuznets, National Income and Its Composition, 1919-1938 (1941).

16 The following adjustments in Kuznets’ series were made: hosiery and knit goods added to textile
mill, deducted from apparel; leather footwear deducted from apparel, added to leather; rubber footwear
deducted from apparel, added to rubber; heating apparatus added to iron and steel; restaurants added
to trade, deducted from service.

17 Lumber and timber basic products; printing and publishing; iron and steel were considered to be
directly comparable.

The absolute differences between the Kuznets and Denison series are for the most part small enough,
the similarity of movement great enough, that it is even possible to splice the series by a simple ratio
link. Except for these three groups, however, regressions were used.

18 W. I. King, The National Income and Purchasing Power, 1930, pp. 56, 60, 122. Bowden has revised
some of the basic data which King relied on, but the use of these substitute figures makes little
difference in the final estimates here presented. Cf. Witt Bowden, War and Post-War Wages, Prices
and Hours, 1914-23 and 1939-44, BLS Bulletin 852. Additional data of Bowden’s, for earlier years,
appear in the Monthly Labor Review, September, 1940.

19 King’s figures have been adjusted by Kuznets with a view to comparability (Kuznets, op. cit., 
Vol. II, pp. 469-474) but these adjustments do not affect our present concern. While King’s implicit 
average compensation figure for nonfarm employees is $1,220 in 1919 and $1,479 in 1920, the estimates 
here presented are $1,260 and $1,448 respectively. King’s figures were therefore assumed to be directly 
comparable with the present estimates. The 1919-43 series could, of course, have been extrapolated all 
the way back to 1890 by Douglas. But instead of such a procedure, based on the 8-year overlap from 
1919-26, it seemed preferable to use King’s figures back to 1910, thus producing an independent series 
which could be related to the Douglas series for the 17-year period, 1910-26. 











TABLE 7 (PART 1)

AVERAGE ANNUAL EARNINGS PER FULL-TIME EMPLOYEE
Manufacturing Industries
1919-1946
(in dollars)

Columns: (1) Food and Kindred Products; (2) Tobacco Manufactures;
(3) Textile-Mill Products; (4) Apparel and Other Finished Fabric Products;
(5) Lumber and Timber Basic Products; (6) Furniture and Finished Lumber Products;
(7) Paper and Allied Products; (8) Printing, Publishing and Allied Industries;
(9) Chemicals and Allied Products.

Year     (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)
1919   1,319     942   1,013   1,237   1,087   1,151   1,236   1,320   1,467
1920   1,470   1,083   1,226   1,405   1,376   1,308   1,461   1,666   1,780
1921   1,448     950   1,052   1,284   1,013   1,282   1,320   1,670   1,531
1922   1,386     950   1,001   1,275     942   1,258   1,308   1,650   1,499
1923   1,437     988   1,110   1,322   1,094   1,317   1,375   1,737   1,556
1924   1,468     981   1,099   1,318   1,107   1,345   1,402   1,812   1,581
1925   1,460   1,001   1,114   1,343   1,121   1,359   1,421   1,868   1,618
1926   1,484   1,006   1,119   1,331   1,135   1,385   1,452   1,964   1,627
1927   1,510     976   1,151   1,364   1,155   1,386   1,470   1,969   1,644
1928   1,516     965   1,141   1,362   1,164   1,379   1,501   2,011   1,686
1929   1,503     979   1,155   1,361   1,172   1,398   1,514   2,010   1,673
1930   1,489     985   1,096   1,265   1,156   1,310   1,487   2,011   1,647
1931   1,451     908   1,039   1,162   1,010   1,196   1,404   1,943   1,608
1932   1,303     787     847     941     787     962   1,028   1,740   1,419
1933   1,204     725     829     900     737     900   1,143   1,599   1,312
1934   1,221     750     883     987     791     948   1,186   1,644   1,341
1935   1,253     778     926   1,016     833     988   1,235   1,698   1,385
1936   1,290     817     952   1,013     911   1,074   1,313   1,702   1,455
1937   1,351     883     994   1,025     963   1,123   1,403   1,722   1,559
1938   1,331     870     926     999     940   1,102   1,359   1,697   1,621
1939   1,372     916     960   1,025     956   1,138   1,414   1,718   1,611
1940   1,385   1,000     986   1,022     934   1,158   1,458   1,764   1,723
1941   1,472   1,117   1,159   1,159   1,026   1,304   1,646   1,852   1,893
1942   1,650   1,240   1,385   1,330   1,204   1,514   1,850   1,973   2,139
1943   1,879   1,431   1,556   1,595   1,449   1,743   2,076   2,158   2,386
1944   2,044   1,580   1,681   1,788   1,564   1,892   2,254   2,376   2,608
1945   2,176   1,693   1,814   1,944   1,617   1,983   2,363   2,577   2,683
1946   2,392   1,798   2,037   2,168   1,781   2,203   2,547   2,871   2,752





relationship between it and the Douglas series.²⁰

Time lost in unemployment. Given a series for full time earnings our
next requisite is a parallel one for time lost in unemployment by nonfarm
employees so that the earnings series can be transformed from a
full time equivalent one into an actual average series. Total unemployment
estimates for the 1929-40 period have been made available recently
by the Bureau of Labor Statistics.²¹ Comparable but unpublished
figures for 1920-29 are likewise available.²² These data constitute
a logical starting point.


20 Douglas, Real Wages, Table 147, p. 392, average earnings in all industries excluding farm labor.
Douglas notes that his series represents the movement of earnings of 73 per cent of those who worked
for wages or salaries in 1920. (Op. cit., p. 389.) Without attempting to revise his estimates on the basis
of later data it is nonetheless clear that his series includes the bulk of nonfarm workers and hence can
be readily used for extrapolating a nonfarm earnings series.

21 Cf. the writer’s Preliminary Estimates of Labor Force, Employment and Unemployment, 1929-1940
(Bureau of Labor Statistics, Occupational Outlook Division, July 1945).


TABLE 7 (PART 2)

(in dollars)

Columns: (1) Products of Petroleum and Coal; (2) Rubber Products;
(3) Leather and Leather Products; (4) Stone, Clay and Glass Products;
(5) Iron and Steel and Their Products (including Ordnance);
(6) Nonferrous Metals and Their Products;
(7) Machinery and Transportation Equipment (except Automobiles);
(8) Automobiles and Automobile Equipment; (9) Total Manufacturing.

Year     (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)
1919   1,671   1,446   1,231   1,271   1,599   1,360   1,493   1,531   1,304
1920   2,224   1,759   1,377   1,528   1,882   1,596   1,700   1,818   1,532
1921   1,808   1,440   1,273   1,397   1,489   1,414   1,543   1,668   1,345
1922   1,725   1,442   1,258   1,308   1,377   1,434   1,446   1,617   1,295
1923   1,794   1,530   1,274   1,460   1,622   1,516   1,605   1,735   1,410
1924   1,782   1,561   1,264   1,536   1,633   1,580   1,629   1,720   1,431
1925   1,804   1,542   1,278   1,519   1,652   1,588   1,658   1,795   1,454
1926   1,821   1,573   1,287   1,519   1,667   1,639   1,678   1,786   1,478
1927   1,832   1,619   1,303   1,546   1,677   1,636   1,747   1,809   1,503
1928   1,858   1,633   1,296   1,568   1,738   1,675   1,754   1,864   1,535
1929   1,844   1,597   1,327   1,557   1,740   1,665   1,756   1,813   1,543
1930   1,904   1,563   1,215   1,525   1,640   1,554   1,715   1,571   1,488
1931   1,810   1,392   1,152   1,386   1,410   1,455   1,507   1,455   1,369
1932   1,619   1,191     970   1,167   1,044   1,177   1,283   1,234   1,150
1933   1,505   1,137     950   1,071   1,073   1,132   1,245   1,170   1,086
1934   1,513   1,248   1,017   1,088   1,166   1,209   1,320   1,314   1,153
1935   1,587   1,358   1,043   1,171   1,295   1,277   1,400   1,489   1,216
1936   1,629   1,472   1,045   1,262   1,446   1,361   1,520   1,600   1,287
1937   1,833   1,526   1,085   1,357   1,591   1,492   1,658   1,672   1,376
1938   1,863   1,457   1,017   1,303   1,359   1,402   1,538   1,653   1,296
1939   1,852   1,548   1,038   1,359   1,549   1,521   1,653   1,762   1,363
1940   1,954   1,583   1,041   1,393   1,643   1,594   1,767   1,934   1,432
1941   2,113   1,778   1,236   1,554   1,923   1,824   2,091   2,243   1,653
1942   2,410   2,116   1,447   1,771   2,284   2,235   2,592   2,880   2,023
1943   2,806   2,478   1,659   2,024   2,637   2,581   2,862   2,978   2,350
1944   3,046   2,699   1,831   2,174   2,781   2,724   3,022   3,103   2,517
1945   3,092   2,715   1,969   2,252   2,806   2,741   2,994   2,984   2,525
1946   3,180   2,826   2,123   2,394   2,687   2,710   2,808   2,796   2,512

Before this series can be used as a measure of time lost by nonfarm 
employees, however, we must consider to what extent the unemploy- 
ment of farm hands and self-employed persons is included in that 
series. If it can be established that unemployment in those groups can 
be balanced off against the failure of the series to allow for the full 
measure of time lost by nonfarm employees, then the series can be 
used as is. 


22 Described by the writer in a forthcoming BLS technical memorandum. 













On balance it may be assumed that the nonfarm self-employed contributed
nothing to the unemployment totals. The small number of
nonfarm self-employed in April 1940, for example (roughly 0.2 out of
4.4 million experienced workers seeking work), was more than compensated
for by the number of disemployed workers who, turning to
self-employment, had disguised their unemployment in that form.²³

Agricultural workers (operators and hired hands) constituted about
4 per cent of total unemployment in April 1930 and about 8 per cent
of the experienced unemployed in March 1940.²⁴ More than compensating
for the inclusion of agricultural unemployment is the exclusion
of two groups of unemployed. First is the group of women and older
workers who, after being without work for some time, withdraw from
the labor force.²⁵ Second is the group of workers who migrated to rural
areas because of lack of work in nonfarm pursuits, the relationship between
farm-nonfarm migration and the level of employment being
well known. It may be added that even if we were to ignore these two
groups and allow in full for agricultural unemployment, the resultant
estimate for 1940 would be 18.3 per cent of time lost rather than 19.9.

Granted these various considerations, therefore, it was decided to
take the level of unemployment shown by the BLS estimates as equivalent
to the level of unemployment among nonfarm employees.²⁶ The per
cent of time lost is computed as the ratio of unemployment on the one
hand to employed nonfarm employees plus unemployed on the other.
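
The ratio just defined can be written out directly. In the sketch below the function name and the figures in the usage lines are hypothetical; only the form of the ratio is taken from the text.

```python
def per_cent_of_time_lost(unemployed, employed_nonfarm_employees):
    """Unemployment as a per cent of employed nonfarm employees plus unemployed."""
    return 100.0 * unemployed / (employed_nonfarm_employees + unemployed)


# Hypothetical figures in millions, for illustration only.
print(per_cent_of_time_lost(unemployed=2.0, employed_nonfarm_employees=38.0))
```

Because the unemployed appear in both numerator and denominator, the measure cannot exceed 100 per cent.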

Given unemployment estimates for 1920-43 we must next derive 
serviceable estimates for prior years. National unemployment enumera- 
tions were conducted and data published in connection with both the 


23 1940 Census of Population, Employment and Personal Characteristics, Table 11, with a deduction 
for unemployed farm operators (1940 Census of Population, Occupational Characteristics, Table 6). 
A discussion of the relatively small decline in self-employment appears in the BLS, Estimates of the 
Non-agricultural Self-Employed, 1929-1940 (1945). 

24 1930 Census of Population, Unemployment, Vol. I, pp. 53-54, Class A and B unemployment. 1940
Census of Population, The Labor Force, Vol. III, Table 59; data in Table 1 for rural farm areas allow
us to take account of new workers and public emergency workers but necessarily include some nonfarm
workers. The rural farm ratio is 11 per cent. It should be noted that some 20 per cent of the experienced
persons seeking work in March 1940 who reported their last occupation as farmer or farm operator did
not report their usual occupation in the same group; this contrasts with the over-all male average of
10 per cent. (1940 Census of Population, Usual Occupations, Tables 11 and II.) There is likely to be some
inflation in agricultural unemployment, therefore, precisely because workers have exported their unemployment
from cities to rural areas, attempting to become farm workers in the process. On the other
hand, some unemployed farm workers undoubtedly sought work in the cities.

25 Gladys Palmer has called attention to “the substantial increase we found in Philadelphia in the
number of households reporting no employable member as the depression continued.” (With respect to
the 1929-40 period, the point is discussed at greater length in a forthcoming BLS Bulletin.)

26 Given a continuing substratum of disguised unemployment these percentages are probably somewhat
too small. More important for present purposes is the fact that the undoubted gain in the importance
of such unemployment during the depression is not reflected in the estimates of time lost. As a
result the gain in nonfarm earnings over the years is exaggerated.
1890 and the 1900 Census of Population. However it does not now appear
possible to make any use of the 1890 estimates.²⁷ An estimate for
1900, however, can be computed from the Census data after certain
adjustments are made.²⁸

(1) The first such adjustment is the exclusion from both the gainful 
worker and the unemployment totals of those workers who do not come 
within the purview of the post-1940 “labor force” concept. For male 
workers it is possible to make such exclusions with at least approximate 
accuracy, utilizing for the purpose the 1901 Cost of Living survey.²⁹

(2) Since the 1900 Census distributes the unemployed into three 
unemployment duration groups it is then necessary to estimate the 
average duration for each group before total time lost in unemployment 
by males can be estimated. As the 1940 Census data indicate it is not 
possible to secure an accurate average for a group as large as 7 to 12 
months of unemployment merely by taking the mid-point. However, 
the 1901 Cost of Living survey gives a distribution into 23 duration 
groups and it is thus possible to calculate an average duration of unemployment
for men in the nonagricultural labor force in 1899-1900.³⁰

(3) In step one the appropriate percentage of total male unem- 
ployed nonagricultural gainful workers in 1899-1900 was deducted in 
order to ensure comparability with later data, using proportions from 
the 1901 Cost of Living survey. In step two medians were computed 
for each of the three unemployment groups after they had been thus 
adjusted. In step three a simple multiplication gave the estimated 
total weeks lost in 1899-1900 by the male nonagricultural labor force. 
After corresponding adjustments in the gainful worker figures to give 
total labor force weeks for this group it was possible to compute the 
per cent of time lost by it: 4.93.³¹


27 The 1890 data, unlike those for 1900 and later years, apply chiefly to unemployment at the
worker’s primary job rather than to total unemployment. Furthermore there is clear evidence that the
1900 Census enumerators operated with a better understanding of what they were seeking, and hence
obtained estimates which were generally higher than those of 1890 merely because the enumeration technique
differed. 1890 Census of Population, Part II, p. cxxvi. 1900 Census of Population, Occupations,
pp. ccxxvi-ccxxxiii.

28 1900 Census, op. cit., pp. 76-77. 

29 Eighteenth Annual Report of the Commissioner of Labor, 1903 (1904), pp. 290-291. It was as- 
sumed that those workers whose unemployment was ascribed to the following causes would have been 
excluded by 1940 definitions: old age, sickness, strike or vacation. On the other hand those for whom 
any of these causes was reported in combination with another cause were included. This was done be- 
cause enumerative procedures in current use undoubtedly include some such workers in practice, and 
definitely include many who are temporarily ill and without a job.

30 Durations, of course, were computed for the groups of workers remaining after the excluded 
workers had been deducted. Data on those unemployed for 52 weeks were taken from page 26 of the 
Report, while all other data appear on pages 290-291.

31 Both teachers and manufacturers were excluded from these computations since according to 
current practice few if any would be reported as unemployed. 
(4) This percentage is very substantially below an estimate of 9.4
per cent to be derived from the Cost of Living Report.³² A major reason
for the divergence is the fact that according to the Report 50 per cent
of male urban heads of families had some unemployment during a
year ending in 1901 while the Census figure for males except in agriculture
was 23 per cent.³³

It seems reasonable to believe that in this respect the Cost of Living 
report is the more accurate measure. In such surveys, where income was 
reconciled with expense on a detailed basis of reporting, there was less 
likelihood that short period unemployment would be forgotten than 
there was in the general purpose population enumeration, which could 
necessarily have little check on the solidity of the unemployment re- 
ports in the absence of questions on income or related factors. It is 
therefore assumed that the Census figure is an underestimate. An up- 
ward adjustment was made by doubling the number in the lowest 
duration group, exclusive of those in agriculture, teachers and manu- 
facturers. This step raised the 1899-1900 percentage to 8.3—or some- 
what below the adjusted 1901 figure of 9.4. 
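
The arithmetic of steps (2) and (3), together with the doubling of the lowest duration group, can be sketched as follows. The sketch assumes a 52-week labor force year and one representative duration per group; the counts and durations shown are hypothetical and are not the Census or Cost of Living figures.

```python
WEEKS_PER_YEAR = 52  # assumed length of the labor force year


def per_cent_time_lost(groups, labor_force):
    """Per cent of labor-force weeks lost to unemployment.

    groups is a list of (workers, average_weeks_unemployed) pairs,
    one pair per unemployment duration group.
    """
    weeks_lost = sum(workers * weeks for workers, weeks in groups)
    return 100.0 * weeks_lost / (labor_force * WEEKS_PER_YEAR)


def double_shortest_group(groups):
    """Copy of groups with the head count of the first group doubled."""
    workers, weeks = groups[0]
    return [(2 * workers, weeks)] + list(groups[1:])


# Hypothetical figures, for illustration only.
groups = [(100, 8.0), (50, 20.0), (20, 40.0)]  # shortest duration group first
labor_force = 1000

reported = per_cent_time_lost(groups, labor_force)
adjusted = per_cent_time_lost(double_shortest_group(groups), labor_force)
print(f"reported {reported:.1f} per cent, adjusted {adjusted:.1f} per cent")
```

Doubling the shortest group raises the percentage by less than doubling any other group of equal size would, since each worker in it contributes the fewest weeks lost.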

This divergence was ignored because the unemployment figures reported
to the Census for nonfarm women are likely to be too high.³⁴
Combining the estimate of 8.3 per cent with the parallel figure for female
workers an over-all percentage of 7.5 was finally secured.³⁵ This
percentage then serves as a benchmark figure for 1900, comparable
with the BLS 1920-40 estimates.
If it were possible to utilize the well known estimates of Douglas 
for the intervening years then a complete unemployment series could 


32 The percentage to be derived directly from the Report is 6.8. However, this must be adjusted to 
allow for self-employed and for single persons—both groups being included in the Census data. In 1940 
there were 7 male nonagricultural employees to 1 employer or own-account worker. (1940 Population 
Census, III, p. 1.) Though trade, service and construction together made up a much smaller fraction
of the nonfarm total in 1940—as Carson’s data demonstrate—we can nonetheless use the 1940 propor- 
tion and thus intentionally underestimate the spread between the Report and the Census estimates. 
Assuming a 0 rate of unemployment for the self-employed we find the adjusted 1901 rate is 6.0. However 
by assuming that the unemployment rate for urban single workers in 1900 was the same proportion of 
the rate for urban married workers in 1900 as it was in 1940 (1940 Census of Population, III, p. 22) 
the head of family rate as adjusted from the 1901 Report was used to construct a single workers rate. 
The two were then weighted by the distribution of nonagricultural males by marital status (1900 
Census, Occupations, p. 52) with the result that the 6.0 figure rises to 9.4 per cent. 

33 Report, p. 287. Occupations, pp. 7, 76. 

34 A comparison of unemployment rates by occupations shows that female rates for various occupations
were above male rates in 1900, instead of almost uniformly below them as in 1930 and 1940. Since
women are not as continuously in the labor force as men are, what is reported as “unemployment”
in some instances is really “non-employment,” or time not in the labor force. (This factor was adjusted
for in the male data on the basis of the 1901 Report.)

35 Data for agricultural unemployment are excluded since the chief reasons which dictated their inclusion
in later years do not apply to the 1900 data. Their inclusion would only reduce the 1900 earnings
estimate by $21.
be calculated. Douglas’s absolute estimates, however, cannot be used
as is. If one follows his procedure of comparing the labor supply
(estimated from Census of Population data) with employment (estimated
from payroll sources) but uses the revised labor supply data
of Daniel Carson and the revised employment estimates of the BLS
one arrives at sharply different results for a test year.³⁶ While the absolute
figures in Table 8 may not be compared the percentages can be.


TABLE 8

UNEMPLOYMENT IN SELECTED INDUSTRIES IN 1920 AS ESTIMATED
FROM DOUGLAS'S, GIVENS’ AND REVISED DATA

                                              Labor Supply  Employment    Per Cent
                                                                        Unemployed
Manufacturing and transportation
  Carson-BLS                                        14,401      14,492        -0.6
  Douglas-Unadj.*                                   10,732      10,628         1.0
  Givens†                                           15,418      14,761         4.3
Manufacturing, transportation, construction
  Present estimate                                  15,521      15,348         1.1
  Douglas‡                                          12,221      11,478         6.1

* Manufacturing plus steam railroads plus street railways. Data prior to arbitrary adjustment to
Givens’ unemployment percentage.

† Includes hand trades in labor supply and possibly in employment as well.

‡ After adjustment of manufacturing and transportation data to Givens’ unemployment percentage.
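
The percentages in Table 8 follow from the other two columns if unemployment is taken as labor supply less employment, expressed as a per cent of labor supply. That reading is an inference from the figures rather than a definition given in the text; the short check below, with illustrative names, reproduces each tabulated percentage to one decimal.

```python
# (labor supply, employment) as printed in Table 8.
TABLE_8 = {
    "Carson-BLS": (14401, 14492),
    "Douglas, unadjusted": (10732, 10628),
    "Givens": (15418, 14761),
    "Present estimate": (15521, 15348),
    "Douglas": (12221, 11478),
}


def per_cent_unemployed(labor_supply, employment):
    """Labor supply less employment, as a per cent of labor supply."""
    return 100.0 * (labor_supply - employment) / labor_supply


for source, (supply, employed) in TABLE_8.items():
    print(f"{source:22s} {per_cent_unemployed(supply, employed):5.1f}")
```

The negative Carson-BLS figure arises simply because the employment estimate exceeds the labor supply estimate for those industries.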


The percentage of manufacturing and transportation combined is 
negative and by any reasonable interpretation is valueless. When 
construction is added the percentage becomes a positive one, but its 
absolute value may still be in question. However the series may be 
reasonably indicative of the changing movement of unemployment. 
Given benchmark data for 1900 and 1920 the Douglas data may be 
used for interpolation. A regression of the present series for per cent 


36 Douglas’ data from Real Wages, Chapter 24. The Carson estimates are from his Industrial
Composition of Manpower in the United States, 1870-1940 (November 1946).

The labor supply in construction was not estimated from the Carson data since he computes the
number of construction laborers in 1920 by applying the 1930 ratio of laborers to number of workers in
selected construction trades. However an unduly high proportion of construction workers in 1920 reported
themselves in the skilled trades, presumably calling themselves carpenters, painters, etc., as a
result of work in war construction. The labor supply was therefore estimated by applying to the BLS
employment estimate the unemployment percentage used by Douglas (p. 451).

The Givens’ estimates are from Recent Economic Changes (1929), Vol. II, pp. 466-478. For most 
of his 1920 unemployment percentages Douglas uses the Givens’ estimates. 

NICB estimates are likewise available for this period, and indeed back to 1900. Economic Record, 
March 20, 1940. These data for the 1900-1920 period, however, seem of questionable value. The 1920 
unemployment percentage, for example, is the resultant of a combination of percentages for each in- 
dustry—the percentages secured by interpolating between a 1900 and a 1930 percentage, according to a 
communication from the NICB. The 1920 figure, however, is likely to have been less than either, and 
this distortion serves to distort the entire 1900-30 level of the estimates. 
of time lost 1920-26 against that of Douglas’ series (for manufacturing,
mining, construction and rail transport) was therefore computed.³⁷
This indicated a sufficiently close relationship so that the present series
could be carried back to 1890. The resultant estimated series was then
used to interpolate between the 1900 benchmark and the 1920 estimate.³⁸
The consideration of such related materials as the trend of
deflated gross product, unemployment among union labor in New York
and the qualitative data in Thorp’s Business Annals tended to confirm
the verisimilitude of the movement thus indicated.³⁹
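
The text does not say how the regression-estimated series was fitted to the two benchmarks. One common device, shown here purely as an assumption, is to spread the discrepancy at each end linearly over the intervening years, so that the adjusted series passes through both benchmarks while keeping the year-to-year movement of the estimated series. Apart from the 1900 values of 7.9 and 7.5 per cent cited in the notes, every figure in the sketch is hypothetical.

```python
def interpolate_between_benchmarks(estimated, benchmarks):
    """Adjust an estimated series to pass through two benchmark values.

    estimated maps year -> regression estimate; benchmarks maps the first
    and last of those years to their benchmark values. The gap at each end
    is prorated linearly across the years between (an assumed rule).
    """
    years = sorted(estimated)
    first, last = years[0], years[-1]
    gap_first = benchmarks[first] - estimated[first]
    gap_last = benchmarks[last] - estimated[last]
    adjusted = {}
    for year in years:
        share = (year - first) / (last - first)
        adjusted[year] = estimated[year] + (1 - share) * gap_first + share * gap_last
    return adjusted


# 1900 figures from the notes; the 1910 and 1920 figures are hypothetical.
estimated = {1900: 7.9, 1910: 6.0, 1920: 4.0}
benchmarks = {1900: 7.5, 1920: 4.0}
print(interpolate_between_benchmarks(estimated, benchmarks))
```

Other rules, such as a proportional rather than an additive correction, would serve the same purpose and give somewhat different intermediate values.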

The per cent of time lost for 1940-46 can be readily computed from
the Bureau of Census data, thus providing a full series for the period
1890-1946.⁴⁰ By applying this series to that previously computed for
full time earnings we secure the final estimates for the actual earnings
of nonfarm employees. The nominal earnings series together with one
adjusted for changes in the cost of living appears in Table 2.⁴¹
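
The final two steps can be sketched under two assumptions that the text implies but does not spell out: that "applying" the time-lost series means reducing full time earnings by the per cent of time lost, and that the cost of living adjustment is a division by an index on a base of 100. The function names and the figures in the usage lines are hypothetical.

```python
def actual_earnings(full_time_earnings, per_cent_time_lost):
    """Full time equivalent earnings reduced for time lost in unemployment."""
    return full_time_earnings * (1.0 - per_cent_time_lost / 100.0)


def real_earnings(nominal_earnings, cost_of_living_index, base=100.0):
    """Nominal earnings restated in the prices of the index base period."""
    return nominal_earnings * base / cost_of_living_index


# Hypothetical figures, for illustration only.
nominal = actual_earnings(full_time_earnings=1500.0, per_cent_time_lost=10.0)
print(nominal)
print(real_earnings(nominal, cost_of_living_index=125.0))
```

On this reading a year of heavy unemployment lowers actual earnings even where full time earnings are unchanged, which is the distinction the two series are meant to capture.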


* * * 


It may not be without interest to compare the movement shown 
in Table 2 with Douglas’ figures for the period when the two full time 
earnings series are independent: 1910 to 1926. Thus Douglas’ full time 
earnings figure for nonfarm wage earners rises 126 per cent from $652 


37 Douglas, op. cit., p. 460, presents estimates for the 4 specified industries for 1897-1926. His series 
was extrapolated to 1890 by its relationship, over the 1897-1926 period, to his series for unemployment 
in manufacturing and transportation (ibid., p. 445). 

38 The 1900 value estimated from the regression was 7.9 per cent, as compared to 7.5 from the Census
materials.

39 The relationship between unemployment percentages and per cent deviations of gross national 
product from trend is discussed in the forthcoming BLS memorandum on the 1920-29 estimates. As 
indicated there the relationship is not such that one can assume the relationship between the two series 
during the 1920’s had the same slope as that in earlier years. Hence it could not be assuredly used for 
extrapolation. However the fact that the slope was like the 1920's, and not the 1930's, did tend to confirm 
a relationship otherwise arrived at. 

For the New York data, cf. BLS Bulletin No. 109, Statistics of Unemployment and the Work of 
Employment Offices, 1913, p. 18. 

40 Bureau of the Census, Labor Force, Employment and Unemployment in the United States, 1940-
1946 (1947). 

41 The cost of living index is the regular BLS series extrapolated by a regression on Douglas’ series 
(op. cit., p. 60). It would have been possible to make some allowance for the BAE series on prices paid by 
farmers for food, prices paid for house furnishings etc.—giving these some weight as representative of 
prices paid by rural nonfarm workers. However, the aggregate weight of this group is relatively small 
and the differences in movement between BAE and BLS price series relatively restricted. Therefore the 
BLS series was used as is. 

42 The actual earnings series developed here was inflated by the ratio of total compensation to total 
wages and salaries for the 1920-43 period. The relevant 1920-29 data appear in Kuznets, National 
Income and Its Composition, Vol. 1, p. 216. The later figures are from the Survey of Current Business, 
July 1947. For the earlier years no adjustment is necessary, King's figures for total compensation 
having been used initially. 
in 1910-14 to $1,473 in 1926.⁴³ The present estimate is necessarily 
higher in 1910-14, being for wage earners plus salaried workers, but its 
rise to $1,458 in 1926 shows an increase of 108 per cent. In part the 
difference in movement comes about because Douglas’ series, excluding 
as it does construction, trade, finance and other groups, is not adapted 
to allow for the effect of changes in the industrial, occupational and sex 
composition of the labor force which result when these groups are 
included. But discussion of the difference should not be allowed to 
obscure the fact that the present estimate of change over this period is 
within 15 per cent of Douglas’—a variance easily explained by the 
factors noted. 
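
The percentages quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text ($652, $1,473, and the 108 per cent rise of the present estimate); the function and variable names are illustrative.

```python
def pct_rise(start, end):
    """Percentage increase from start to end."""
    return 100.0 * (end - start) / start

# Douglas' full time earnings, 1910-14 to 1926, as quoted in the text.
douglas_rise = pct_rise(652, 1473)

# Rise of the present estimate, as quoted in the text.
present_rise = 108.0

# How far the present estimate of change falls short of Douglas', in per cent of Douglas'.
shortfall = 100.0 * (douglas_rise - present_rise) / douglas_rise

print(round(douglas_rise), round(shortfall, 1))
```

The shortfall comes out a little under 15 per cent, which agrees with the statement that the two estimates of change lie within 15 per cent of each other.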

The change in unemployment percentages over the 1910-1926 pe- 
riod is, of course, almost identical in magnitude, the movement of 
the present series being based on Douglas’ for most of these years. 
(Douglas’—for manufacturing, mining, transport and construction— 
falls by 22 per cent, while the present one for all nonfarm declines by 
21 per cent.) It is hardly surprising, therefore, that the net change in 
real earnings which Douglas shows for those 4 industries is very similar 
to that shown for all employees in the present estimates, 133 per cent 
in one instance, 112 per cent in the other.⁴⁴,⁴⁵


43 Douglas, op. cit., Table 147, p. 390. 

44 Ibid., pp. 460, 468. With further reference to the similarity, in a general sense, it should be noted 
that materials on full time earnings and employment drawn upon for use in this present study derive in 
no small part from the prodigious and invaluable labors of Douglas. 

45 The absolute levels of unemployment percentages are, of course, not comparable: Douglas’ 
apply to only part of all wage earners, albeit the major part. As he has emphasized, there is no separate 
labor supply for manufacturing since there is an interchange between its workers and those in construc-
tion, transport, etc. (ibid., p. 602). It follows, therefore, that unemployment percentages should be 
computed over as wide a base as possible, preferably all nonfarm workers, as is done in the present 
study. 








DIRECT DETERMINATION OF COMPASS SETTINGS 
FOR PROPORTIONATE AREA PIE-CHARTS* 


HERMAN LASKEN 
Inglewood, Calif. 


This note presents a chart which can be used by draftsmen 
to determine directly from raw area data compass settings for 
plotting proportionate area circles without any of the usual 
square-root and proportioning computations. 


AVOIDING here any discussion of the validity, from a purely statis-
tical standpoint, of proportionate area pie-charts as a means of 
presenting statistical data, the fact remains that such charts are fre- 
quently used and that their preparation involves some computations. 
Regardless of the order in which the computations are made, the de- 
termination of the compass settings which will achieve proportionate 
areas of circles requires extraction of square roots and division of all 
such square roots by the square root of the circle taken as the standard 
of measurement for the particular chart. 
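
The computation that the chart replaces can be sketched in a few lines. The function name, the sample areas, and the choice of a 2-inch largest radius are assumptions for illustration; the formula itself (radius proportional to the square root of the area, scaled to the circle taken as the standard) is the one described above.

```python
import math

def compass_settings(areas, largest_radius):
    """Radii that make circle areas proportional to the given data.

    The largest area is taken as the standard and drawn with
    `largest_radius`; every other radius is scaled by the square
    root of its area relative to that standard.
    """
    standard = max(areas)
    return [largest_radius * math.sqrt(a / standard) for a in areas]

# Hypothetical area data, largest circle drawn with a 2-inch radius.
radii = compass_settings([100, 25, 4], largest_radius=2.0)
print(radii)
```

An area one quarter of the standard receives half the radius, which is the proportioning the chart performs graphically.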

The accompanying chart is designed to eliminate all such computa-
tions in the drafting of pie-charts. Following the order of the instruc-
tions which are included on the chart: 

Step 1 reduces the given area data to two digits as a maximum, 
which is more than sufficient accuracy for this type of chart. 

Step 2 amounts simply to a square-root computation. The heavy 
curved line is the curve of square roots corresponding to the area scale, 
with the vertical scale set at values of zero to ten. (The scale is omitted 
from the chart as it is unnecessary for the purpose of the chart and 
simply adds confusion.) 

Steps 3 and 4 determine the compass setting for the largest circle 
on the chart and automatically select the oblique lettered line to be 
used for all remaining compass settings. 

Step 5 instructs the draftsman to draw his biggest circle, with his 
compass as set in step 4. 

Step 6 provides for similar settings and plottings for the remaining 
areas. All settings are to be made from the zero line to the oblique line 
determined as above. 

The original chart is a ten-inch square, principally for convenience 
of use. However, its use is not affected by changes in size. 


* The term “pie chart” as used in this note, refers to circle charts, both whole and segmented. 










STANDARD CHART FOR DIRECT DETERMINATION OF COMPASS SETTINGS FOR 
PROPORTIONATE AREA PIE-CHARTS 

INSTRUCTIONS FOR USE 

1. POINT OFF GIVEN AREA DATA TO TWO PLACES FOR THE LARGEST VALUE. 
2. DETERMINE POINT ON CURVE CORRESPONDING TO LARGEST AREA. 
3. SET ONE POINT OF COMPASS ON ZERO LINE HORIZONTAL TO POINT SO DETERMINED. 
4. SET SECOND COMPASS POINT ALONG SAME HORIZONTAL ON LETTERED LINE NEAREST RADIUS OF 
   LARGEST CIRCLE DESIRED IN VIEW OF GENERAL SIZE AND LAYOUT OF CHART. 
5. NOTING LETTERED LINE SO DETERMINED, PLOT CIRCLE ON CHART WITH COMPASS AS SET. 
6. ALL OTHER SETTINGS ARE THEN MADE FROM THE ZERO LINE TO THE SAME LETTERED LINE ALONG 
   HORIZONTALS DETERMINED FROM THE POINTS ON THE CURVE CORRESPONDING TO THE GIVEN AREAS. 

Experience with the chart shows that most draftsmen require a single 
demonstration of its use. Once familiar with it, they use it with facil- 
ity and eliminate both the usual computations and the possibility of 
error inherent in such computations. 











PROFILE GRAPHS 


JOHN V. SPIELMANS 
Marquette University 


This article is concerned with certain graphical devices, 
called “Profile Graphs,” for the integrated presentation of 
data of several dimensions. The several types of such profiles 
(two- and three-dimensional, simple and composite, statistical 
and historical) are explained; their advantages and disad- 
vantages discussed; and illustrations given for the several 
types. 


THIS ARTICLE is concerned with some graphical devices for the pres-
entation of statistical data of several dimensions. Although the 
device is not new¹ we feel justified in discussing it here, for two reasons. 
On the one hand, the method, its considerable merit notwithstanding, 
seems little known, or at least, little used.² On the other hand our analy-
sis of the types of relationships which lend themselves to this kind of 
presentation serves, we believe, to widen the scope of the method. 

We begin with a brief discussion of the nature of multi-dimensional 
relations. By “dimensions” we mean data of distinct denomination 
which combine by multiplication into a further significant datum. A 
two-dimensional relation is thus characterized by three data, one of 
which is the product of the other two: $A = n \cdot m$ (or inversely, one of 
which is the quotient of the other two: $m = A/n$); a three-dimensional 
one by four data, one of which is the product of the other three: 
$A = p \cdot q \cdot m$, and so on. 

Many basic relations in the field of economics, as well as in other 
fields, fall into this class. To enumerate a few: value equals quantity 
times price; payrolls equal number of employees times average pay; 
daily wages equal hourly rates times hours per day; man-days of work 
(or of strike) equal number of workers times average days of work (or 
of strike); passenger-miles of traffic equal number of passengers times 
average mileage, and so on. 

This variety of multi-dimensional relations can generally be reduced 
to a common type. In all such instances the data refer to classes of 
elements pertaining to given time intervals. In two-dimensional rela- 
tions the one factor, say n, represents the number of elements in the 

1 K. Karsten in Charts and Graphs, New York, 1925, p. 613 ff. describes certain aspects of it under 
the name Area Bar Charts, although, as we believe, without full appreciation of their significance. 
2 The present author hit upon a scheme of this kind in trying to give an integrated graphical 


presentation of strike data. See “Strike Profiles,” Journal of Political Economy, Dec., 1944, pp. 319 ff. 
He has not seen it used elsewhere. 
class during the time interval in question (number of unit goods pro-
duced, or sold, or consumed; number of acres planted; number of per-
sons doing a certain thing; number of certain events occurring per 
year, or per month); while the other factor m gives a numerical attri-
bute of the elements (prices of goods, productivity of workers, or of 
acres; duration of actions or of events). The product $A = n \cdot m$ then 
represents the total value of the attribute pertaining to the class of 
n elements as a whole (total value; total product; total earnings; 
total man-days, etc.). 

In most instances, it is true, the attribute m is not constant, but 
varies among the n elements in the class. In that case the aggregate 
value of A is obtained, not simply by multiplying, but by the summa-
tion of the m-values: 

\[ A = \sum_{i=1}^{n} m_i. \]

In terms of this aggregate, however, we can define an “average” or 
“mean value” of m: 

\[ \bar{m} = \frac{1}{n} \sum_{i=1}^{n} m_i = \frac{A}{n}, \]

wherefore A can again be presented as a product of two factors, 
$A = n \cdot \bar{m}$. 
Three-dimensional relations can similarly be reduced to a common 
type in terms of a class or classes of elements: the whole class in ques-
tion contains p subclasses of q (or $\bar{q}$) elements each, so that the total 
number of elements in the class is $n = p \cdot q$. If m (or $\bar{m}$) is the numerical 
attribute of the elements, then $A = n \cdot m = p \cdot q \cdot m$ is the aggregate 
m-value of the class as a whole. (For example, p acres of average pro-
ductivity $\bar{q}$ bu/acre produce $n = p \cdot \bar{q}$ bu; with the price m cts/bu, the 
total value of the crop for the given year is $A = p \cdot \bar{q} \cdot m$ cts.) 


GRAPHICAL REPRESENTATION OF MULTI-DIMENSIONAL RELATIONS 


The customary graphical representation of multi-dimensional data 
is by way of multiple graphs: triplets of line or bar graphs in the same 
or in adjacent diagrams for the three data n, m, and A. (For example 
for quantity, price, and value of commodities, or for employment, 
average earnings, and payrolls, and the like.)³


3 Since the viewpoint in such presentations does not necessarily coincide with the one taken here, 
such multiple graphs do not always show all, or not only, the (n + 1) data belonging to an n-dimensional 
Such multiple graphs, however, while showing clearly the data and 
their time variations separately, fail to give visual expression to the 
dimensional relation between them. 

In contrast the profile graphs represent all the data which enter 
into the relationship in a single, dimensionally integrated figure. 

A. Profiles of two-dimensional relations. To represent two-dimen-
sional data one plots the two factors n and m respectively⁴ as the width 
and height of a rectangle, whose area then represents the product 
datum A. Constructing such rectangles for successive time intervals, 
and placing them side by side, each with its proper “time label,” one 
obtains what we call a “statistical profile.” Such a profile shows in a single 
figure the three time series n(t), m(t), and A(t) “dimensionally in-
tegrated,” the datum A(t) appearing visually as the product resulting 
from the multiplication of the two distinct data n(t) and m(t). 

The economy of this presentation is due to the fact that both axes 
of the diagram are utilized for the two dimensions n and m of the rela- 
tion in question, while none is used as a “time axis.” Chronological 
time is merely represented by the time labels of the several rectangles. 
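
The construction of a statistical profile can be sketched as the computation of one rectangle per time interval, each placed immediately to the right of its predecessor. This is only an illustrative sketch: the function name, the dictionary layout, and the sample figures are assumptions, not part of the method as published.

```python
from itertools import accumulate

def profile_rectangles(series):
    """Lay out a statistical profile.

    `series` is a list of (time_label, n, m) triplets. Each rectangle
    has width n and height m, so that its area is A = n * m; the
    rectangles are placed side by side along the horizontal axis,
    which measures n and is not a time axis.
    """
    widths = [n for _, n, _ in series]
    # Left edge of each rectangle = total width of those before it.
    lefts = [0.0] + list(accumulate(widths))[:-1]
    return [
        {"label": label, "left": left, "width": n, "height": m, "area": n * m}
        for (label, n, m), left in zip(series, lefts)
    ]

# Hypothetical index values for two successive five-year intervals.
rects = profile_rectangles([("1900-04", 2.0, 3.0), ("1905-09", 4.0, 2.5)])
for r in rects:
    print(r)
```

Writing the value of `area` inside each rectangle, as suggested in the text, makes the product A readable without estimating areas by eye.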

Against the advantages of this compact and integrated presentation 
stand certain disadvantages. First, the numerical values of the areas 
are less easily read and compared than in line or bar graphs showing 
the values of A separately. This disadvantage can be offset largely by 
writing the values of A inside the rectangles. (See Fig. 1.) Secondly, 
the time variations, not only of the areas, but also of the widths of the 
rectangles stand out less clearly than on graphs showing each variable 
separately. Again, the profiles are at a disadvantage in comparison to 





relationship. Thus the Bureau of Labor Statistics shows customarily in one diagram the indices of em- 
ployment and payrolls, without the implied average earnings, while the National Industrial Conference 
Board (Studies of American Wages) shows in one diagram the indices of employment and average earn- 
ings, without the total payrolls. Likewise Babson’s Business Reports shows in multiple line graphs the 
quantities and prices of commodities, but not the implied total values; whereas other presentations will 
show doublets of line or bar graphs for quantity and total values (e.g. of crops), without the implied 
price. Again one finds multiple graphs of data belonging not to one, but to several two-dimensional 
relations. Thus the N.I.C.B. in its Weekly Chart Service, Road Maps to Industry, shows in one single 
diagram 15 line graphs of manufacturing indices, all of which can be grouped into triplets of two-di- 
mensional data. On the other hand these and similar publications also show the exact triplets of two- 
dimensional data. For instance, the Road Maps to Industry show triplets of line graphs for indices of 
volume, price, and value of goods sold; of payrolls, employment, and average earnings; of number of 
strikes, workers involved, and average workers per strike, etc. Similarly in the Survey of Current Busi- 
ness (U. S. Department of Commerce) triplets of line graphs appear in adjacent diagrams for weekly 
hours, hourly earnings, and weekly earnings; for quantity, price and value of production. Since the 
latter part of 1937 the N.I.C.B., loc. cit., separates the 15 indices of Manufacturing Changes into two 
groups called respectively Basic and Derived Indices. In our designation the former are the data n and 
A, the latter the data $\bar{m} = A/n$. While the making of this distinction draws attention to the dimensional 
relation between the data, no visual expression is given to them in the graphs. 
4 In the following we are using m as synonymous with $\bar{m}$, unless the distinction is essential. 
FIGURE 1 


BITUMINOUS COAL PRODUCTION IN THE UNITED STATES, 1900-39 
FIVE-YEAR AVERAGES; INDEXES, 1929 = 100 
Note: Since the data presented in this figure vary on the whole but slowly a year-to-year profile would 
be impractical. But by presenting five-year averages a distinct and telling picture of the development 
as a whole results; the steadily increasing productivity per man-hour, combined with an initially strong 
increase, and later decrease of man-hours, resulting, after a prolonged increase, in an eventual sharp 
decrease in total output. 





line graphs whenever the time variations of the data are only small per-
centages of the whole.⁵ However, even under such circumstances it is 
often possible to obtain worthwhile profiles by constructing the several 
rectangles for larger time intervals. The objection has also been made 
that profile graphs are difficult to understand. This objection, we be- 
lieve, can best be met by proper legends on the graphs showing clearly 
the meaning, units of measurement, and scales for all the data involved. 

To summarize, the profiles are useful insofar as the advantages of 
the dimensional integration outweigh the several disadvantages. More- 
over, one can always show in separate diagrams any one of the variables 
which does not stand out with desired clarity in the profile. 

B. Profiles of three-dimensional relations. For relations of the type 
A=p-q-m one can utilize the third space dimension and plot p, q, 
and m respectively as the length, width, and height of rectangular 
solids, whose volumes then represent the product A. In addition the 


5 In a line graph one can show the smallest percentage changes as large as one pleases by simply 
using a large enough scale, and placing the zero level into the unseen depths below the diagram. In 
a profile, however, where the whole amounts from zero up are represented, percentage variations can-
not be made to look large when in fact they are small. Hence in such situations, where a line graph may 
be seen to rush dramatically up and down, a profile would offer the drab sight of a sequence of practi- 
cally equal rectangles, giving, it is true, the correct information as to practically unchanging values, 
but unable to exhibit to the eye the small variations which do take place. 
areas of the faces of the solid show the partial products $p \cdot q$, $q \cdot m$, 
and $p \cdot m$. Of these the first, and frequently also the second, are them-
selves significant data, $p \cdot \bar{q} = n$ as the total number of elements in the 
whole class, and $\bar{q} \cdot m$ as the aggregate m-value per sub-class. 

This presentation is very compact, being capable of showing up to 
six significant data dimensionally integrated in a single figure. (See 
below Fig. 3.) On the other hand the disadvantages noted in the two-
dimensional profiles pertain a fortiori to the three-dimensional ones, 
since the volumes of solids are still less readily compared than the 
areas of rectangles. Hence a separate presentation of the product A 
may often be desirable. 
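
The six data carried by one rectangular solid can be enumerated for the crop example used earlier (acres, bushels per acre, cents per bushel). The numerical values and the function name below are hypothetical and serve only to show which edge, face, or volume carries which datum.

```python
def solid_profile(p, q_bar, m):
    """Edges, the two significant face areas, and the volume of a solid profile."""
    return {
        "p": p,                     # length: number of subclasses (acres)
        "q_bar": q_bar,             # width: elements per subclass (bu/acre)
        "m": m,                     # height: attribute per element (cts/bu)
        "n": p * q_bar,             # face p x q: total number of elements (bu)
        "per_subclass": q_bar * m,  # face q x m: aggregate value per subclass (cts/acre)
        "A": p * q_bar * m,         # volume: aggregate value of the class (cts)
    }

# Hypothetical crop: 40 acres, 25 bu/acre, 90 cts/bu.
crop = solid_profile(p=40, q_bar=25, m=90)
print(crop)
```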

To avoid these somewhat cumbersome solids one can represent three-
dimensional relations in merely two space dimensions, by means of 
“ruled rectangles.” We first construct a rectangle whose width is the 
partial product $n = p \cdot \bar{q}$. With m for its height its area represents the 
three-dimensional product $A = p \cdot \bar{q} \cdot m$. Next we divide the width n 
into a number of segments equal or proportional to p, whereby each 
segment has a length proportional to $n/p = \bar{q}$. At the points of division 
we rule the rectangle by parallel lines into strips, whose width is thus 
proportional to $\bar{q}$. 
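
The ruling of such a rectangle reduces to placing interior lines at equal spacings across the width. The sketch below assumes that p is a whole number of strips and uses hypothetical values; the names are illustrative only.

```python
def ruled_rectangle(p, q_bar, m):
    """Geometry of a ruled rectangle for the relation A = p * q_bar * m.

    The width is n = p * q_bar and the height is m. The rectangle is
    divided into p strips of width q_bar, so p - 1 interior rulings
    are drawn parallel to the vertical sides.
    """
    n = p * q_bar
    rulings = [k * q_bar for k in range(1, p)]  # positions of interior lines
    return {"width": n, "height": m, "area": n * m, "rulings": rulings}

# Hypothetical example: 4 subclasses of 2.5 elements each, attribute 3.
rect = ruled_rectangle(p=4, q_bar=2.5, m=3.0)
print(rect)
```

When p is large, the text's alternative of making the number of strips merely proportional to p (for instance, one strip per thousand establishments) keeps the rulings legible.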

Such “ruled profiles” exhibit, properly integrated, all the significant 
data entering into three-dimensional relations. Although likely to be 
less accurate in the presentation of the factors p and $\bar{q}$, they have the 
advantage of being more easily read and interpreted than the more 
complex solid figures. 

Figures 1 and 2 show examples of profiles of two-dimensional rela- 
tions; Figure 3 of a solid, and Figure 4 of a ruled profile of three-di-
mensional relations.⁶

C. Composites.⁷ The profile method appears to be of special advan-
tage where the question is to represent not merely the data of a class 
(n, m, A) or (p, $\bar{q}$, m, A) as a whole, but the distribution of these data 
over a number of component classes into which the whole class is di-
vided. 


6 Among the many other data which can advantageously be presented as two-dimensional profiles 
we mention: volume of crops (= acreage × bu/acre); earnings of certain classes of wage earners (= working 
time × wage rate); workers involved in labor board cases (= number of cases × workers per case); 
passenger-miles or ton-miles of traffic (= number of passengers or tons × average mileage). Of three-
dimensional relations: annual man-hours in an industry (= average employment × daily hours × average 
working days in the year); annual earnings (= daily hours × wage rate × average working days in the 
year). Both these are of great interest in connection with the question of a guaranteed annual wage. 

7 The figures here described as “composites” are the only type with which Karsten (op. cit.) 
illustrates his “Vertical Area-Bar-Charts”; neither the two and three dimensional, nor the ruled time- 
series profiles are given or described. 
FIGURE 4 


EMPLOYMENT AND EARNINGS IN TOBACCO MANUFACTURING 
UNITED STATES, 1899-1939 


Aggregate Wages = Number of Wage Earners × Average Annual Earnings 
Source: U. S. Census of Manufactures, 1939, II, 1, p. 270 


Note: In this “ruled profile” several employment data pertaining to the tobacco industry are shown 
dimensionally integrated: total number of wage earners (width); total wages (area); average annual 
earnings (height); number of establishments to nearest thousand (number of internal strips); average 
number of workers per establishment (width of strips). The profile thus shows at a glance the develop-
ment from most poorly paid production in many very small establishments towards larger scale pro-
duction in ever fewer establishments, employing in toto fewer workers at higher, but still very low 
average wages. 





The customary presentation of a distribution of data over com- 
ponent classes is either by way of compound bars; or by circular charts 
showing the share of each component as sectors of the full circle; or, 
where the time variations of the several components are of principal 
interest, by way of compound line graphs showing the components 
added vertically onto each other. To present in such manner the trip- 
lets of data of a two-dimensional relationship distributed over k com- 
ponents one would need three separate k-fold compound line or bar 
graphs; or pairs of circular charts, for the presentation of the n and A 
data, combined with a bar graph for the m data. And at that these mul-
tiple compound graphs would fail to show the dimensional interrela-
tionships between the particular triplets ($n_i$, $m_i$, $A_i$) belonging to one 
and the same component class. 

Under the dimensional scheme one combines such triplets for each 
component class into one rectangle, and places all k such component 
rectangles side by side. The figure so obtained, consisting of k com-
ponent rectangles all of which belong to one common time interval, 
we call a “composite” (in contrast to the “profiles,” which consist of 
rectangles belonging to successive time intervals). 
In composites the disadvantages of the dimensional presentation 
are far smaller than in the profiles, because they concern a distribution 
of values at a given time only, not their variation in time. On the other 
hand the advantage of showing all k triplets of data in their proper 
dimensional relationship in a single figure is very considerable. 

Finally one can construct “composite profiles,” that is, sequences 
of composites for successive time intervals, which are capable of 
giving a very compactly informative view of the changing aspects of 
complex phenomena.⁸

The three figures below show illustrations of various types of com- 
posites: Figure 5 of a simple distribution of two-dimensional data; 
Figure 6 of a special kind, which we call “frequency composite”; 
Figure 7 of a sequence of “ruled” composites.⁹





FIGURE 5 


RAILROAD TRAVEL, BY CLASSES OF PASSENGERS 
UNITED STATES, 1941, 1943 


Passenger-Miles = Passengers × Average Mileage 
Source: Statistics of Railways in the U. S., 1943, Tables 44, 52 


Note: This figure shows data for railroad traffic: number of passengers (width), average mileage 
(height), and passenger-miles (area) distributed over three classes of passengers. The several influences 
of wartime conditions are seen, dimensionally integrated: the increased number but constant mileage 
of the commuters; and the increase, both in number and mileage of actual travelers. 





8 See “Strike Profiles,” loc. cit., figs. 2, 3, 4, 5. 

9 Other data which can effectively be presented as composites are: volumes of crops (acreage 
× bushel/acre) distributed over various geographical regions, states, or countries; values of crops 
(acreage × value per acre) distributed over various crops; payrolls (workers × average pay) distributed 
over different industries, or different classes of personnel; traffic data, as in Figure 5, distributed over 
various means of transportation (railroads, airplanes), and many others. 
FIGURE 6 
CONSUMERS' INCOME, BY INCOME LEVELS, UNITED STATES 1935 


Aggregate Income = Consumer Units × Average Income 

Note: The special feature in this “frequency composite” is that the component classes (income levels) 
are simply magnitude classes of the dimension $\bar{m}$ (average income) itself. Consequently the data con-
stitute essentially a frequency distribution, the width ($n_i$) showing how many elements have m-values 
in the interval ($\bar{m}_{i+1} - \bar{m}_{i-1}$) (the heights of the two adjacent rectangles). However this composite 
shows additionally also for each income level the aggregate income (area $A_i$), and the average income 
(height $\bar{m}_i$), and thus gives more information than ordinary frequency graphs, simple and cumulative. 





D. Historical Profiles. Many apparently one-dimensional line or bar 
graphs of time series are really two-dimensional profiles, although of 
a different kind than those discussed above. 
FIGURE 8 
As an example let us take a line graph showing as its ordinate y a 
labor force varying in time t plotted along the horizontal time axis 
(Fig. 8). Let $\Delta t$, in days, be a time interval during which y is sufficiently 
constant. Then the product $y \cdot \Delta t = \Delta A$, which is the strip of area under 
the curve standing over $\Delta t$, gives the number of man-days of employ-
ment. Hence the total area under the curve between two given dates $t_1$ 
and $t_2$, 

\[ A = \sum_{t_1}^{t_2} y \cdot \Delta t, \]

gives the total of man-days of employment during the time $t_2 - t_1$. 
The figure, therefore, is a profile showing three dimensionally related 
data: the number y of workers employed at any time t; the duration $\Delta t$ 
of employment of y men; and the man-days of employment A. 
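
The area under such a curve can be accumulated strip by strip. In the sketch below each strip is a pair (y, Δt) with y taken as constant over the interval; the figures and names are hypothetical.

```python
def man_days(strips):
    """Total area under a stepwise employment curve.

    `strips` is a list of (workers, days) pairs, where the number of
    workers is treated as constant over each interval of `days`.
    """
    return sum(y * dt for y, dt in strips)

# Hypothetical labor force: 100 men for 10 days, 120 for 5, 90 for 15.
total = man_days([(100, 10), (120, 5), (90, 15)])
print(total)
```

The same summation applies when the height is a time rate rather than an enduring magnitude: each product of rate and interval is then a count of occurrences, and the total area is their sum.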

In other instances the height y of a line or bar graph represents, not 
as in the above case, a magnitude which endures in time, but a number 
of occurrences taking place and accumulating per time unit, that is, a 
time rate, $\Delta N / \Delta t$ (e.g. the number of births per year; or the number 
of strikes per month; or the number of workers quitting their jobs per 
month, or per week, etc.). In such cases the area under the curve 

\[ A = \sum_{t_1}^{t_2} y \cdot \Delta t = \sum_{t_1}^{t_2} \frac{\Delta N}{\Delta t} \cdot \Delta t = \sum_{t_1}^{t_2} \Delta N = N \]

represents the total number of occurrences during the period $t_2 - t_1$. 
Hence again the graph is truly a profile representing three dimensionally 
related data: the time rate y, the time $\Delta t$ over which the rate prevails, 
and the total number of occurrences during the time interval in ques-
tion.¹⁰

Profiles of this kind, in which chronological time is represented by a 
time axis, we call “historical profiles,” in contrast to the statistical pro- 
files discussed above. 

A special form of historical profile results when the phenomenon 
represented is intermittent rather than continuous, that is, occurs at 
certain times, lasts for a while, and ceases again. The profile then con- 
sists of separate, quasi-rectangular figures along the time axis, their 
place showing the time of occurrence, their height the “magnitude,” 
their width the duration, and their area the “time integral” of each 
particular occurrence. 

Profiles of this kind are useful when individual occurrences rather 
than statistical summaries are important. For example, where indi- 
vidual strikes rather than overall strike statistics are of interest—as 
are, indeed, from the standpoint of management the strikes in its own 
establishments—“historical strike profiles” could be used to con- 
siderable advantage. Any one strike would be shown at its proper place 
along the time axis by plotting vertically the day-to-day number of 
workers involved, whereby the area would give the number of man- 
days idle. Such profiles show at a glance when, how often, and for how 
long how many men were out, and how many man-days were lost.¹¹

On similar principles, only on a time scale ranging over centuries 
rather than months and years, one can construct very instructive “his- 
torical war profiles,” for individual nations or groups of nations, by 
using for the height the absolute or relative “number of men involved,” 
that is, the size of the armies, as a rough yardstick for the magnitude of 
wars (at least in the pre-atomic era), while the width of the rectangles 
shows the duration, and the areas the “man-years” of wars. 

10 These dimensional relations are familiar in the field of physics. Two examples analogous to the 
situations discussed above are: when a force F is plotted as a function of time t, the area under the 
curve, $\int_{t_1}^{t_2} F\,dt$, represents the impulse; and when a velocity $v = ds/dt$ is plotted as a function of time t, 
the area under the curve, $\int_{t_1}^{t_2} v\,dt$, represents the total distance $s_2 - s_1$ traversed during the time interval 
$t_2 - t_1$. 

11 To render such profiles more informative one could show further, say by distinctive coloring, 
the principal causes of the various strikes; through shaded extensions of the strike rectangles one could 
show the number of workers made indirectly idle through the strike. To allow the representation, in 
one same diagram, of large and of small strikes the vertical scale could be made logarithmic. 
A METHOD FOR OBTAINING AND ANALYZING 
SENSITIVITY DATA* 


W. J. DIXON 
University of Oregon 
AND 


A. M. MOOD 
Iowa State College 


The standard method of dealing with sensitivity of dosage- 
mortality data is the probit technique developed by Bliss and 
Fisher. This paper provides an alternative technique based on 
a special system for obtaining such data. It has some ad- 
vantages when observations must be taken on individuals 
rather than groups of individuals, and it may be preferred in 
certain other situations. 


INTRODUCTION 


EXPERIMENTAL investigations often deal with continuous variables 
which cannot be measured in practice. For example, in testing the 
sensitivity of explosives to shock, a common procedure is to drop a 
weight on specimens of the same explosive mixture from various 
heights. There are heights at which some specimens will explode, and 
others will not, and it is assumed that those which will not explode would 
explode were the weight dropped from a sufficiently greater height. It 
is supposed, therefore, that there is a critical height associated with 
each specimen, and that the specimen will explode when the weight is 
dropped from a greater height and will not explode when the weight 
is dropped from a lesser height. The population of specimens is thus 
characterized by a continuous variable—the critical height—which 
cannot be measured. All one can do is select some height arbitrarily 
and determine whether the critical height for a given specimen is less 
than or greater than the selected height. 

This situation arises in many fields of research. Thus in testing
insecticides, a critical dose is associated with each insect, but one cannot
measure it. One can only try some dose and observe whether or not
the insect is killed, that is, observe whether the critical dose for that
insect is less than or greater than the chosen dose. The same difficulty
arises in pharmaceutical research dealing with germicides, anesthetics,
and other drugs; in testing strength of materials; in psycho-physical
research dealing with threshold stimuli; and in several areas of
biological and medical research.

* This paper is in part an adaptation of a memorandum submitted to the Applied Mathematics
Panel by the Statistical Research Group, Princeton University. The Statistical Research Group
operated under a contract with the Office of Scientific Research and Development, and was directed
by the Applied Mathematics Panel of the National Defense Research Committee.

In true sensitivity experiments it is not possible to make more than 
one observation on a given specimen. Once a test has been made the 
specimen is altered (the explosive is packed; the insect is weakened) 
so that a bona fide result cannot be obtained from a second test. The 
common procedure in experiments of this kind is to divide the sample 
of specimens into several groups (usually but not necessarily of the 
same size) and to test one group at a chosen level, a second group at a 
second level, and so on. The data consist of the numbers affected and 
not affected at each level. A method of analyzing such data (variously 
called “sensitivity” data, “all or none” data, “quantal responses”) has 
been developed by Bliss and Fisher [references 1, 2], and discussed by 
other writers [3, 4, 5, 6]. 


THE “UP AND DOWN” METHOD 


A new technique for obtaining sensitivity data has been developed 
and used in explosives research. The authors became acquainted with 
this new method in 1943 at the Explosives Research Laboratory, Bruce- 
ton, Pennsylvania. It has come to be called the “up and down” method. 
The method may be employed in any sensitivity experiment, but we 
shall discuss it in terms of the explosives to avoid general terminology. 

The technique is to choose some initial height $h_0$, and a succession
of heights $h_1, h_2, h_3, \ldots$ above $h_0$ together with a succession
$h_{-1}, h_{-2}, h_{-3}, \ldots$ below $h_0$. The first specimen is tested by
dropping the weight from height $h_0$. If the first specimen explodes, the
second specimen will be tested at $h_{-1}$, otherwise the second specimen will
be tested at $h_1$. In general, any specimen will be tested at the level
immediately below or immediately above the level of the previous test
according as there was or was not an explosion on the previous test. The
result of such an experiment might be portrayed as in Figure 1 where the
x’s represent explosions and the o’s non-explosions. The first test is on
the left at the highest level; this was a success (explosion) so the second
test was made at the next lower level and was also a success; the third test
was therefore made at the level below that of the second and since it was a
failure the fourth test was made at the level above that of the third test.
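
The rule just described can be sketched in a few lines of Python. This is only an illustrative simulation and not part of the original procedure: the function name `up_and_down`, the simulated critical levels, and the parameter values (chosen to resemble the Figure 1 example) are assumptions made here.

```python
import random

def up_and_down(critical_levels, y0, d):
    """Apply the up-and-down rule to a sequence of specimens.

    critical_levels: the (normally unobservable) critical level of each
        specimen; here it is used only to decide explosion or no explosion.
    y0: level of the first test.  d: spacing between adjacent levels.
    Returns (level_index, exploded) pairs; the tested level is y0 + index * d.
    """
    record = []
    index = 0  # the first test is made at the initial level
    for critical in critical_levels:
        exploded = (y0 + index * d) > critical
        record.append((index, exploded))
        # Move one level down after an explosion, one level up otherwise.
        index = index - 1 if exploded else index + 1
    return record

# Hypothetical sample: sixty simulated critical levels, first test at y0 = 2.
rng = random.Random(0)
specimens = [rng.gauss(1.3, 0.2) for _ in range(60)]
record = up_and_down(specimens, y0=2.0, d=0.3)
```

Because every non-explosion at one level is followed by a test at the next higher level, the simulated record has the same staircase character as the record described for Figure 1.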

The primary advantage of this method is that it automatically
concentrates testing near the mean. We shall see later that this increases
the accuracy with which the mean can be estimated. Or in other words,
for a given accuracy the up and down method will require fewer tests
than the ordinary method of testing groups of equal size at preassigned
heights. The saving in the number of observations may be of the order
of 30 to 40 per cent (see Appendix A).

FIGURE 1
RECORD OF A SAMPLE OF SIXTY TESTS

Another advantage is that the statistical analysis is quite simple in 
certain circumstances whereas the analysis for the ordinary method is 
rather tedious. 

The method has one obvious disadvantage in certain kinds of experi- 
ments because it requires that each specimen be tested separately. This 
is not important in explosives experiments because each test must be 
made separately anyway. But in tests of insecticides, for example, a 
large group of insects can sometimes be treated as easily as a single 
one, and in large experiments of this kind any advantage of the up and 
down method might well be outweighed by this requirement of single 
tests. Even here, if expensive laboratory animals were being used, the 
advantage in economy of tests might offset the trouble of making single 
tests. 


CONDITIONS ON THE EXPERIMENT 


The statistical analysis of data obtained can be quite simple provided 
the experiment satisfies certain conditions. Less restrictive conditions 
must be fulfilled in order that any analysis will be possible. These will 
be discussed here and the actual analysis will be given in the following 
section. 

In the first place, the analysis requires that the variate under 
analysis be normally distributed. In practice the variate of interest to 
the research worker can rarely be considered to be normally distributed. 
It is therefore necessary that the natural variate be transformed to 
one which does have the normal distribution. This is readily done pro- 
vided the research worker has enough experience and data on his ma- 
terial to be able to specify rather accurately the shape of his distribution 
function. It is often the case in dosage mortality experiments and in
experiments on explosives that the logarithm of the dosage concentration
or of the height is reasonably normally distributed. But in other
areas of research, and sometimes in these areas, other transformations
are more appropriate [7].

If one has no idea of the shape of his distribution function then the 
data of the experiment itself must be used to provide this information. 
The common procedure here is to compute the percentage affected at 
each level and plot these percentages on arithmetic probability paper 
against various functions of the variate in question. Usually one can 
soon discover what sort of function will force the percentages to lie 
sensibly along a straight line. There are, of course, infinitely many 
functions to choose from; the chosen function should be as simple as 
possible consistent with whatever knowledge is available concerning the 
nature of the material at hand. 

We have already mentioned that the up and down method is par- 
ticularly effective for estimating the mean. It is not a good method for 
estimating small or large percentage points (for example, the height at
which 99 per cent of specimens explode) unless normality of the dis- 
tribution is assured. In fact no method which uses the normal distribu- 
tion can be relied on to estimate extreme percentage points because 
such estimates depend critically on the assumption of normality. In 
most experimental research, it is possible to find simple transformations 
which make the variate essentially normal in the region of the mean, 
but to make it normal in the tails is quite another matter. Nothing 
short of an extensive exploration of the distribution involving perhaps 
thousands of observations will suffice here. Bartlett [8] has recently 
presented an interesting technique for dealing with this problem. 

A second condition on the experiment is that the sample size must be 
large if the analysis to be described is to be applicable. As it turns out, 
the effective sample size is only about half the actual sample size. The 
statistical analysis is based on large sample theory so that if one uses 
the analysis on a sample of size forty, he will in effect be using large
sample theory on a sample of size twenty. Measures of reliability 
may well be very misleading if the sample size is less than forty or 
fifty. 

A further condition is necessary if the statistical analysis is to be
simple. One must be able to estimate roughly in advance the standard
deviation of the normally distributed transformed variate. The interval
between testing levels should be approximately equal to the standard
deviation. This condition will be well enough satisfied if the interval
actually used is less than twice the standard deviation. This requirement
is not severe, for research workers who repeatedly perform these
experiments on essentially similar materials can usually make very
good preliminary estimates. This is the case in explosives research or
biological assay, for example. This circumstance (of repeated
experiments) is precisely the one in which a simple analysis is most desirable.


STATISTICAL ANALYSIS 


The simple method of analysis given in this section is applicable only 
when all the conditions described in the preceding section are fulfilled. 
The theory underlying the method is given in Appendix A. The more 
complex analysis required when the levels are not equally spaced or 
when the distance between levels exceeds twice the standard deviation 
is given in Appendix B. 

We again revert to the explosives experiment in describing the method.
Suppose it is known for the given type of explosive that the logarithms
of the critical heights are normally distributed. Letting $h$
represent the height, $y = \log h$ will then be the normally distributed
variate. We shall call $y$ the normalized height, and represent the mean
and variance of its distribution by $\mu$ and $\sigma^2$. The experiment is
performed by choosing an initial height for the first test, say $h_0$. This
should be chosen near the anticipated mean. The other testing levels
are determined so that the values of the normalized height $y$ are equally
spaced. If $d$ is the preliminary estimate of $\sigma$, and if $y_0 = \log h_0$,
then the actual testing heights are obtained by putting
$\log h = y_0 \pm d,\ y_0 \pm 2d,\ y_0 \pm 3d, \ldots$, and solving for $h$.
The heights will then be so spaced that the transformed variate is equally
spaced with spacing equal to its anticipated standard deviation. All
computations are done in terms of $y$.

In any experiment the total number of successes will be approximately
equal to the total number of failures. In fact, the number of
failures at any level cannot differ by more than one from the number of
successes at the next higher level. For estimating $\mu$ and $\sigma$ only the
successes or only the failures are used, depending on which has the smaller
total. In the example shown in Figure 1 there are fewer failures than
successes so the failures would be used. We shall let $N$ denote the
smaller total and let $n_0, n_1, n_2, \ldots, n_k$ denote the frequencies at each
level for this less frequent event, where $n_0$ corresponds to the lowest level
and $n_k$ the highest level on which the event occurs. We have then
$\sum n_i = N$.

The estimates of $\mu$ and $\sigma$ are based on the first two moments of the
$y$ values using the frequencies $n_i$. But since the $y$ values are equally
spaced, the moments are more easily computed in terms of the two
sums

$$A = \sum i\,n_i, \qquad B = \sum i^2 n_i.$$

In this notation, the estimate of $\mu$, say $m$, is

$$m = y' + d\left(\frac{A}{N} \pm \frac{1}{2}\right) \tag{1}$$

where $y'$ is the normalized height corresponding to the lowest level on
which the less frequent event occurs. The plus sign is used when the
analysis is based on the failures, and the minus sign when it is based
on the successes.

The sample standard deviation is

$$s = 1.620\,d\left(\frac{NB - A^2}{N^2} + 0.029\right) \tag{2}$$

and this, of course, is the estimate of $\sigma$. This is a curious estimate in that
while it is a linear function of $(NB - A^2)/N^2$, it gives the estimate of
the standard deviation, not the square of the standard deviation. The
formula is an approximate one which is quite accurate when
$(NB - A^2)/N^2$ is larger than 0.3 but breaks down rapidly when
$(NB - A^2)/N^2$ becomes less than 0.3. In the latter instance the formula
cannot be used and the more elaborate calculation described in Appendix B
must be employed.

The example of Figure 1 will illustrate the use of the formulas. Here
the $y$ values used were 2, 1.7, 1.4, 1.1, 0.8; the level of the first test $y_0$
being 2, and $d$ being 0.3. Among the sixty tests there were 31 explosions
and 29 failures, hence the latter are used to estimate the parameters.
The failures appear on three levels (0.8, 1.1, 1.4) with frequencies
$n_0 = 2$, $n_1 = 18$, $n_2 = 9$. We have then $N = 29$, $A = 36$, $B = 54$, so that the
mean is

$$m = 0.8 + 0.3\left(\frac{36}{29} + \frac{1}{2}\right) = 1.32$$

and the standard deviation is

$$s = (1.620)(0.3)\left(\frac{270}{841} + .029\right) = .17.$$
The sample was actually drawn from a normal population with $\mu = 1.3$
and $\sigma = 0.2$ using Mahalanobis’ [9] table of random normal deviates.
The mean and standard deviation of the sixty observations were 1.312
and .158 so that it was a fairly representative sample.

Percentage points would be estimated by $m + ks$ where $k$ is chosen
from tables of the normal deviate to give the desired percentage. Thus 
in the example, the 5 per cent point is estimated by 


1.32 — (1.645)(.17) = 1.04. 


If the y values are thought of as logarithms to base ten of actual
heights in inches in an explosives experiment, the antilogarithms of 
estimated percentage points would be estimates of the corresponding 
points for the distribution of h. Thus the median (not mean) value of 
h is estimated by 


antilog 1.32 = 20.9 inches 
and the 5 per cent height by 
antilog 1.04 = 11 inches. 


The antilogarithm of s does not estimate the standard deviation for 
h, however, and any computation which involves the standard devia- 
tion (estimates of percentage points, confidence limits) must be done 
in terms of the normalized height, and only the final result transformed 
to actual heights. 
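
The arithmetic of this section can be checked with a short Python sketch of equations (1) and (2). The function name `up_down_estimates` and its argument names are illustrative choices made here; the data are those of the Figure 1 example, and the guard at 0.3 reflects the stated range in which equation (2) is accurate.

```python
def up_down_estimates(y_lowest, d, counts, based_on_failures):
    """Estimates m and s from equations (1) and (2).

    y_lowest: normalized height of the lowest level on which the less
        frequent event occurs.
    d: spacing of the normalized heights.
    counts: frequencies n_0, n_1, ... of the less frequent event,
        lowest level first.
    based_on_failures: True for the plus sign in (1), False for the minus sign.
    """
    N = sum(counts)
    A = sum(i * n for i, n in enumerate(counts))
    B = sum(i * i * n for i, n in enumerate(counts))
    spread = (N * B - A * A) / N ** 2
    if spread < 0.3:
        # Equation (2) is said to break down here; Appendix B is needed.
        raise ValueError("(NB - A^2)/N^2 is below 0.3; equation (2) does not apply")
    half = 0.5 if based_on_failures else -0.5
    m = y_lowest + d * (A / N + half)
    s = 1.620 * d * (spread + 0.029)
    return m, s

# Figure 1 example: failures on levels 0.8, 1.1, 1.4 with frequencies 2, 18, 9.
m, s = up_down_estimates(0.8, 0.3, [2, 18, 9], based_on_failures=True)
five_per_cent_point = m - 1.645 * s
median_height = 10 ** m  # assumes y is a base-ten logarithm of the height
```

Note that the transformation back to heights is applied only to the final results, in keeping with the remark that all computations involving the standard deviation must be done in terms of the normalized height.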


CONFIDENCE INTERVALS 


Ordinarily the standard deviation of a sample mean, $m$, is given by
$\sigma_m = \sigma/\sqrt{N}$ where $\sigma$ is the population standard deviation and $N$ the
sample size. In the present case this expression must be multiplied by
a factor which we shall call $G$, so that the formula for the standard
error of the mean is

$$\sigma_m = G\sigma/\sqrt{N} \tag{3}$$

and $G$ depends on the ratio $d/\sigma$ and on the position of the mean relative
to the testing levels. $G$ is plotted in Figure 2 as a function of $d/\sigma$.
The position of the mean relative to the testing levels does not affect
$G$ unless the interval $d$ is large; the solid branch of the curve gives the
value of $G$ when the mean falls on one of the testing levels, while the
dashed branch gives the value when the mean falls midway between
two levels. Curves for other positions of the mean would fall between
the two branches.
In practice $\sigma$ is not known and $s$ must be used in (3) to obtain an
estimate, say $s_m$, of $\sigma_m$. In the illustrative example with $s = .17$, we have
$d/s = 1.8$ so that $G$ is about 1.12. The estimate of $\sigma_m$ is therefore

$$s_m = (.17)(1.12)/\sqrt{29} = .035.$$

A confidence interval for $\mu$ may now be estimated by $m \pm k s_m$. Thus a
95 per cent confidence interval is

$$1.32 \pm (1.96)(.035) \quad \text{or} \quad 1.25 \text{ to } 1.39$$

using large sample theory. For moderate values of $N$, it might be
preferable to use the value of $k$ given by the $t$ distribution for $N - 1$
degrees of freedom, but it is likely that this is a minor matter relative
to the error of using large sample theory for moderate values of $N$.
Again assuming the confidence interval refers to the logarithm of an
actual height, it gives rise to an asymmetric 95 per cent confidence
interval (18 to 25 inches) for the median height.

The standard error of the sample standard deviation, say $\sigma_s$, is
ordinarily given by $\sigma/\sqrt{2N}$, but in the present analysis an additional
factor is again required. We shall write

$$\sigma_s = H\sigma/\sqrt{N} \tag{4}$$

where we have incorporated the $1/\sqrt{2}$ into the extra factor. $H$ is
plotted in Figure 2 where the solid branch gives the value of $H$ when the
mean falls on a level, while the dashed branch gives the value when the
mean is midway between two levels. When $d/\sigma$ is less than two there
will be little error introduced by interpolating linearly between the
two branches for other positions of the mean. Thus if the mean falls
$d/4$ from a testing level, one may use the value of $G$ midway between
the two branches. For the illustrative example with $d/s = 1.8$, we find
$H$ to be about 1.24 so that the estimate of $\sigma_s$ is

$$s_s = (1.24)(.17)/\sqrt{29} = .039.$$

The estimate $s_s$ would be used to estimate the standard error of a
percentage point, $m + ks$; the estimate would be $\sqrt{s_m^2 + k^2 s_s^2}$. Thus in the
example, a 95 per cent confidence interval for the 5 per cent point
would be estimated by $1.04 \pm (1.96)\sqrt{(.035)^2 + (1.645)^2(.039)^2}$ or .88 to
1.20. We should mention again that the estimation of small or large
percentage points depends strongly on the assumption of normality in
the tails. It can easily happen that a relatively small error in this
assumption may far outweigh the sampling error indicated by the
confidence interval, especially in the case of very extreme percentages, say
1 per cent or 0.1 per cent.
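
The standard errors of this section can be sketched in Python as follows. The factors G and H are not computed here; the values 1.12 and 1.24 are simply the readings from Figure 2 quoted in the text, and the helper names `standard_errors` and `interval` are assumptions of this sketch.

```python
import math

def standard_errors(s, N, G, H):
    """Estimated standard errors of m and of s from equations (3) and (4)."""
    root_n = math.sqrt(N)
    return G * s / root_n, H * s / root_n

def interval(center, std_error, k=1.96):
    """Large-sample interval center -/+ k * std_error."""
    return center - k * std_error, center + k * std_error

# Illustrative example: s = .17, N = 29, G and H as read from Figure 2.
s_m, s_s = standard_errors(s=0.17, N=29, G=1.12, H=1.24)
mean_low, mean_high = interval(1.32, s_m)

# Standard error of a percentage point m + k*s, here the 5 per cent point.
k_point = -1.645
point = 1.32 + k_point * 0.17
se_point = math.sqrt(s_m ** 2 + k_point ** 2 * s_s ** 2)
point_low, point_high = interval(point, se_point)
```

As in the text, the interval for the mean would be transformed to heights only at the end, which is what makes the resulting interval for the median height asymmetric.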





CHOICE OF TESTING INTERVAL 


The curves in Figure 2 have been extended beyond $d = 2\sigma$ in order
to show what happens to the measures of precision for larger intervals.
Curve $G$ shows that the precision of the mean steadily decreases as $d$
increases. The two branches of $H$ show that there is an optimum spacing
for estimating the standard deviation depending on the position of the
mean relative to the testing levels. Since the mean is usually unknown,
this information is of little practical value.

Curve $G$ indicates that the interval should be quite small for maximum
precision in the mean, but in practice this is not true for several
reasons. In the first place the curves are for expected values and
essentially assume infinite sample sizes, and in fact very large samples are
required to get good estimates of the mean for a very small interval.
The estimate may be biased appreciably toward the initial testing level
unless the sample is very large. Secondly, a small interval may cause
one to waste observations unless a good choice for the initial level is
made. If a poor choice is made, many observations must be spent getting
from that level to the region of the mean. And finally, since $\sigma$ is
usually unknown, the precision of the mean must actually be measured
by $s$, and the accuracy of $s$ becomes poor for very small intervals as
shown by curve $H$.

All these considerations indicate that the interval should be within
the range of about $0.5\sigma$ to $2\sigma$, and experiments with the method support
this conclusion.


APPENDIX A 


If $y$ is normally distributed with mean $\mu$ and variance $\sigma^2$, and if tests
are made at

$$y_i = y_0 + id, \qquad i = 0, \pm 1, \pm 2, \ldots \tag{5}$$

where $y_0$ is the level of the initial test, then there will be, say, $n_i$
successes and $m_i$ failures at $y_i$, and the distribution of these latter variates
is

$$P(n, m \mid y_0) = K \prod_{i=-\infty}^{\infty} p_i^{n_i} q_i^{m_i} \tag{6}$$

where

$$p_i = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{y_i} e^{-(t-\mu)^2/2\sigma^2}\,dt = 1 - q_i \tag{7}$$

and where $K$ is not a function of $\mu$ and $\sigma^2$.

The estimation of $\mu$ and $\sigma^2$ is based on the principle of maximum
likelihood. We shall not maximize (6) directly, however, because a material
simplification in the analysis can be made by neglecting a small part of
the information in the sample. It is clear that

$$|n_i - m_{i-1}| = 0 \text{ or } 1$$

so that either one of the sets $(n_i)$ or $(m_i)$ contains practically all the
information in the sample. If $N = \sum n_i$ and $M = \sum m_i$, and assuming
$N \le M$, we may write (6) in the form

$$P(n, m \mid y_0, M - N) = K' \prod (p_i q_{i-1})^{n_i} \tag{8}$$

and this is the expression which will be maximized. Even if $M - N$ is
not small, only a small amount of information is being neglected,
because in this instance the initial level will have been poorly chosen and
these neglected observations will have been spent in getting from $y_0$
to the region of the mean; they will obviously contribute little to the
more precise location of the mean.
On putting the derivatives of (8) with respect to $\mu$ and $\sigma$ equal to
zero we have the relations

$$\sum n_i \left( \frac{z_{i-1}}{q_{i-1}} - \frac{z_i}{p_i} \right) = 0 \tag{9}$$

$$\sum n_i \left( \frac{x_{i-1} z_{i-1}}{q_{i-1}} - \frac{x_i z_i}{p_i} \right) = 0 \tag{10}$$

where $z_i$ represents the ordinate of the distribution of $y$ at $y_i$ and
$x_i = (y_i - \mu)/\sigma$. The expected values of the left hand sides of these
two expressions are readily found to be zero on substituting $E(n_i)$ for
$n_i$. $E(n_i)$ may be determined from the relation


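$$\frac{E(n_{i+1})}{q_i} = \frac{E(n_i)}{p_i}. \tag{11}$$

If we let

$$w_0 = 1, \qquad w_i = \prod_{j=0}^{i-1} \frac{q_j}{p_j} \quad (i > 0), \qquad w_i = \prod_{j=i}^{-1} \frac{p_j}{q_j} \quad (i < 0),$$

then it follows that

$$E(n_i) = N w_i \Big/ \sum w_i. \tag{12}$$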

The maximum likelihood estimates of $\mu$ and $\sigma$ are the roots, say $\hat\mu$
and $\hat\sigma$, of equations (9) and (10). While there is no simple closed
expression for these roots, it turns out that they can be very closely
approximated when $d < 2\sigma$. The function

$$\frac{z(x)}{q(x)} - \frac{z(x + d/\sigma)}{p(x + d/\sigma)}$$

where $u = x + d/2\sigma$, is nearly linear in $u$ when $d < 2\sigma$. This is illustrated
in Figure 3. Similarly

$$a(u) = \frac{x\,z(x)}{q(x)} - \frac{(x + d/\sigma)\,z(x + d/\sigma)}{p(x + d/\sigma)}$$

is nearly quadratic in $u$ as is indicated by the graph of its first
derivative in Figure 3, where $r = d/\sigma$. We may conclude therefore that the
estimates are essentially determined by the first two moments of the
$y_i$ using the $n_i$ as weights.


If we let

$$\hat\mu_1 = \frac{1}{N} \sum n_i y_i, \qquad \hat\mu_2 = \frac{1}{N} \sum n_i y_i^2 \tag{13}$$

we find

$$E(\hat\mu_1) = \mu + d/2 \tag{14}$$

and

$$E(\hat\mu_2) - E^2(\hat\mu_1) + d^2/4 = \ldots \tag{15}$$

The expression on the right of (15) is nearly linear in $\sigma$ when $d < 2\sigma$,
and its linear approximation was used to determine the estimate of $\sigma$
given in equation (2). The function is plotted in Figure 4; the solid
branch represents the function when the mean falls at one of the $y_i$
and the dashed branch when the mean falls midway between two levels.
The two branches diverge rapidly as $d$ becomes larger than $2\sigma$.

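The variances and covariance of $\hat\mu$ and $\hat\sigma$ are determined from the
second derivatives of $L = \log P$ where $P$ is defined by (8). The
expected values of the derivatives are readily found to be

$$E\left(\frac{\partial^2 L}{\partial \mu^2}\right) = -\frac{N}{\sigma^2} \sum w_i \left( \frac{z_{i-1}^2}{q_{i-1}^2} + \frac{z_i^2}{p_i^2} \right) \Big/ \sum w_i = -\frac{N}{\sigma^2 G^2} \tag{16}$$

$$E\left(\frac{\partial^2 L}{\partial \mu\,\partial \sigma}\right) = -\frac{N}{\sigma^2} \sum w_i \left( \frac{x_{i-1} z_{i-1}^2}{q_{i-1}^2} + \frac{x_i z_i^2}{p_i^2} \right) \Big/ \sum w_i \tag{17}$$

$$E\left(\frac{\partial^2 L}{\partial \sigma^2}\right) = -\frac{N}{\sigma^2} \sum w_i \left( \frac{x_{i-1}^2 z_{i-1}^2}{q_{i-1}^2} + \frac{x_i^2 z_i^2}{p_i^2} \right) \Big/ \sum w_i = -\frac{N}{\sigma^2 H^2} \tag{18}$$

Expression (17) does not vanish unless the mean falls on a level or
midway between two levels. However, we have regarded the covariance
as being negligible for all practical purposes. It gives rise to a maximum
correlation between $\hat\mu$ and $\hat\sigma$ of the order of .0002 when $d = \sigma$, and
.02 when $d = 2\sigma$. We have then

$$\sigma_{\hat\mu}^2 = G^2\sigma^2/N, \qquad \sigma_{\hat\sigma}^2 = H^2\sigma^2/N \tag{19}$$

approximately, where $G$ and $H$ are defined in (16) and (18). These are
the functions plotted in Figure 2.

FIGURE 4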
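
For readers without access to Figure 2, the factors G and H can be evaluated directly from their definitions in (16) and (18). The Python sketch below is one way of doing so under stated assumptions: the ordinates are taken to be standard normal ordinates at the standardized levels (as in Tables I and II), the sums are truncated at a hypothetical `span` of eight levels on either side, and the names `precision_factors`, `r` and `offset` are introduced here for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def ordinate(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def lower_tail(x):   # p
    return 0.5 * math.erfc(-x / SQRT2)

def upper_tail(x):   # q
    return 0.5 * math.erfc(x / SQRT2)

def precision_factors(r, offset=0.0, span=8):
    """G and H for spacing r = d/sigma.

    offset is the standardized position of level 0 relative to the mean:
    0 when the mean falls on a level, r/2 when it falls midway between two.
    """
    def x(i):
        return offset + i * r
    w = {0: 1.0}
    for i in range(1, span + 1):
        w[i] = w[i - 1] * upper_tail(x(i - 1)) / lower_tail(x(i - 1))
    for i in range(-1, -span - 1, -1):
        w[i] = w[i + 1] * lower_tail(x(i)) / upper_tail(x(i))
    total = sum(w.values())
    info_mean = 0.0
    info_sd = 0.0
    for i, weight in w.items():
        a = ordinate(x(i - 1)) / upper_tail(x(i - 1))   # z_{i-1}/q_{i-1}
        b = ordinate(x(i)) / lower_tail(x(i))           # z_i/p_i
        info_mean += weight * (a * a + b * b)
        info_sd += weight * ((x(i - 1) * a) ** 2 + (x(i) * b) ** 2)
    return math.sqrt(total / info_mean), math.sqrt(total / info_sd)

G_on_level, H_on_level = precision_factors(r=1.0, offset=0.0)
G_midway, H_midway = precision_factors(r=1.0, offset=0.5)
```

With the interval equal to the standard deviation, G comes out very close to one for either position of the mean, which is consistent with the variance $\sigma^2/N$ used for the up and down method in the comparison with the probit method.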
It is not possible to make a very satisfactory comparison between
this method and the ordinary probit method, but the following
computations provide some indication of the relative efficiencies. Suppose $2N$
individuals are divided into five equal groups and tested at $y = 0$,
$\pm\sigma$, $\pm 2\sigma$. Bartlett [8], for example, shows that the variance of $\hat\mu$ for
the probit analysis is about $5(.564)\sigma^2/2N$, whereas for the up and down
method the variance is $\sigma^2/N$ which when divided by the former value
gives 71 per cent. When the sample is tested in six equal groups at
$y = \pm\tfrac{1}{2}\sigma, \pm\tfrac{3}{2}\sigma, \pm\tfrac{5}{2}\sigma$, the ratio becomes 58 per cent. But these
comparisons are not fair unless there is considerable uncertainty as to the
general location of the mean. If the mean can be located to within, say, $\sigma$
of its true position in advance of the experiment, then the efficiency of
the probit method can be much improved by using groups of unequal
size and testing the larger groups at levels thought to be near the mean.
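
The 71 per cent figure for the five-group design can be reproduced with a short calculation. The sketch below is not Bartlett's derivation; it assumes that the large-sample variance of the probit estimate of the mean is the reciprocal of the summed information, where one test at standardized level $x$ contributes $z^2/pq$ in units of $1/\sigma^2$, and the function names are chosen here for illustration.

```python
import math

def information_per_test(x):
    """Assumed information about the mean from one test at standardized
    level x, in units of 1/sigma^2: ordinate squared over p*q."""
    z = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    p = 0.5 * math.erfc(-x / math.sqrt(2.0))
    q = 0.5 * math.erfc(x / math.sqrt(2.0))
    return z * z / (p * q)

def relative_efficiency(levels):
    """Up-and-down variance sigma^2/N divided by the grouped-design variance
    when 2N tests are split equally among the given standardized levels."""
    k = len(levels)
    total = sum(information_per_test(x) for x in levels)
    # Grouped variance is k*sigma^2 / (2N*total), so the ratio is 2*total/k.
    return 2.0 * total / k

five_levels = [-2.0, -1.0, 0.0, 1.0, 2.0]
coefficient = 1.0 / sum(information_per_test(x) for x in five_levels)
ratio = relative_efficiency(five_levels)
```

Under this assumption the coefficient comes out at about .563, close to the .564 quoted from Bartlett, and the ratio at about 71 per cent.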


APPENDIX B 


When the chosen testing interval is larger than $2\sigma$, or when the
intervals are of unequal size, it is necessary to solve equations (9) and (10)
for $\mu$ and $\sigma$. The intervals will be of unequal size, for example, when
the normalizing transformation is unknown in advance of the
experiment and must be deduced from the results of the experiment itself.
A method of trial and error is probably as good as any other for solving
the equations. One would first choose preliminary estimates, say $m$ and
$s$, of the roots by using equations (1) and (2) or simply by using guessed
values. These preliminary estimates would be adjusted until the
equations were satisfied to the desired degree of approximation. The left
side of (9) will be positive when the trial value of $\mu$ is too small, and
negative when it is too large. The left side of (10) will be positive when
$s < \hat\sigma$, and negative when $s > \hat\sigma$. The equation (9) is relatively
insensitive to changes in $s$, while the same is true of (10) for changes in $m$.

In order to facilitate the computations, the accompanying tables of
$z/p$ (Table I) and $z/q$ (Table II) are provided. For negative values of
$x$, $p$ and $q$ are interchanged, that is

$$\frac{z(x)}{p(x)} = \frac{z(-x)}{q(-x)}.$$
We shall illustrate the computation using the data of Figure 5. The
normalized heights are .1, .9, 1.5, 1.9 as indicated in the Figure. We
shall number the levels 0, 1, 2, 3 beginning with the lowest level.
Since there are more successes than failures, the latter are used to
determine the estimates. A preliminary estimate of $\mu$ may be obtained by
using the average of the midpoints of the intervals weighted by the
numbers $n_i$; thus we shall put

$$m_1 = \tfrac{1}{29}\,[2(1.7) + 26(1.2) + (.5)] = 1.2.$$

A rough estimate of $\sigma$ may be determined by observing that the
interval 0.9 to 1.5 appears to contain 26/29 or about 90 per cent of the
distribution, hence we may use

$$1.645\,s_1 = \tfrac{1}{2}(1.5 - 0.9), \qquad s_1 = 0.18.$$
FIGURE 5
RECORD OF A SAMPLE OF SIXTY TESTS

Normalized      Number of
  Height        x’s    o’s
   1.9            3      0
   1.5           27      2
    .9            1     26
    .1            0      1





In adjusting these estimates one might be tempted to adjust $m_1$ first
by equation (9) and then go to equation (10) and adjust $s_1$ by using a
good estimate of $\mu$. It turns out, however, that the job can be done
much more rapidly by considering both equations together. The
following computational form may be used:

  i  | $n_i$ | $h_i$ | $x_i$ | $n_i\left(\frac{z_{i-1}}{q_{i-1}} - \frac{z_i}{p_i}\right)$ | $\frac{x_i z_i}{q_i}$ | $\frac{x_i z_i}{p_i}$ | $n_i\left(\frac{x_{i-1} z_{i-1}}{q_{i-1}} - \frac{x_i z_i}{p_i}\right)$
  3  |   2   |  1.9  |  3.89 |  4.17 |       |   0   |  6.96
  2  |  26   |  1.5  |  1.67 |   .00 |  3.48 |  .174 | -9.05
  1  |   1   |   .9  | -1.67 | -2.08 | -.174 | -3.48 |  3.48
  0  |       |   .1  | -6.11 |       |   0   |       |
 Sum |       |       |       |  2.09 |       |       |  1.39

Note that the table is arranged so that the frequencies of either the
o’s or x’s will be entered in the table as though they were x’s. The
symbol $x_i$ represents $(h_i - m_1)/s_1$ where $h_i$ is the height and $m_1$ and $s_1$
are the first approximations to $\mu$ and $\sigma$. The other computations are
defined by the column headings. Thus the figure 4.17 at the top of the
fifth column is obtained as 2(2.084 - .000); 2.084 being read from Table
II at $x = 1.67$, and .000 being the value of $z/p$ at $x = 3.89$ as shown by
Table I. The sums, 2.09 and 1.39, of the fifth and eighth columns give
the values of the left hand sides of equations (9) and (10) respectively.
Since both sums are positive, we conclude that both $m_1$ and $s_1$ are too
small. Using $m_2 = 1.3$ and $s_2 = .19$ we repeat the above calculation:


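  i  | $n_i$ | $h_i$ | $x_i$ | $n_i\left(\frac{z_{i-1}}{q_{i-1}} - \frac{z_i}{p_i}\right)$ | $\frac{x_i z_i}{q_i}$ | $\frac{x_i z_i}{p_i}$ | $n_i\left(\frac{x_{i-1} z_{i-1}}{q_{i-1}} - \frac{x_i z_i}{p_i}\right)$
  3  |   2   |  1.9  |  3.16 |  3.12 |       |  .01  |  3.28
  2  |  26   |  1.5  |  1.05 | -5.85 |  1.64 |  .282 | -9.75
  1  |   1   |   .9  | -2.11 | -2.47 | -.093 | -5.21 |  5.21
  0  |       |   .1  | -6.32 |       |   0   |       |
 Sum |       |       |       | -5.20 |       |       | -1.26

TABLE I
VALUES OF z/p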
These results show that the roots are bracketed, and good estimates of
$\mu$ and $\sigma$ may be obtained by interpolation between the sums.
Interpolating between 1.2 and 1.3 using 2.09, 0, -5.20, we find $m_3 = 1.23$,
and similarly $s_3 = .185$. By doing two more calculations similar to the
two illustrated above, one would verify the third figures in $m$ and $s$
and obtain good estimates for the fourth figures. Here, the results
to three figures are $m = 1.23$ and $s = .187$. However, the data do not


TABLE II
VALUES OF z/q

 x  | .00 | .01 | .02 | .03 | .04 | .05 | .06 | .07 | .08 | .09
0.0 0.798 | 0.804 | 0.811 | 0.817 | 0.824 | 0.830 | 0.836 | 0.843 | 0.849 | 0.856 
0.1 0.863 | 0.869 | 0.876 | 0.882 | 0.889 | 0.896 | 0.902 | 0.909 | 0.916 | 0.923 
0.2 0.929 | 0.936 | 0.943 | 0.950 | 0.957 | 0.964 | 0.970 | 0.977 | 0.984 | 0.991 
0.3 0.998 | 1.005 | 1.012 | 1.019 | 1.026 | 1.033 | 1.040 | 1.047 | 1.054 | 1.062 
0.4 1.069 | 1.076 | 1.083 | 1.090 | 1.097 | 1.105 | 1.112 | 1.119 | 1.126 | 1.134 
0.5 1.141 | 1.148 | 1.156 | 1.163 | 1.171 | 1.178 | 1.185 | 1.193 | 1.200 | 1.207 
0.6 1.215 | 1.222 | 1.230 | 1.237 | 1.245 | 1.253 | 1.260 | 1.268 | 1.275 | 1.283 
0.7 1.290 | 1.298 | 1.306 | 1.313 | 1.321 | 1.329 | 1.336 | 1.344 | 1.352 | 1.360 
0.8 1.367 | 1.375 | 1.383 | 1.391 | 1.399 | 1.406 | 1.414 | 1.422 | 1.430 | 1.438 
0.9 1.446 | 1.454 | 1.461 | 1.469 | 1.477 | 1.485 | 1.493 | 1.501 | 1.509 | 1.517 
1.0 1.525 | 1.533 | 1.541 | 1.549 | 1.557 | 1.565 | 1.573 | 1.581 | 1.590 | 1.598 
1.1 1.606 | 1.614 | 1.622 | 1.630 | 1.638 | 1.646 | 1.655 | 1.663 | 1.671 | 1.679 
1.2 1.687 | 1.696 | 1.704 | 1.712 | 1.720 | 1.729 | 1.737 | 1.745 | 1.754 | 1.762 
1.3 1.770 | 1.779 | 1.787 | 1.795 | 1.804 | 1.812 | 1.820 | 1.829 | 1.838 | 1.846 
1.4 1.854 | 1.862 | 1.871 | 1.879 | 1.888 | 1.896 | 1.905 | 1.913 | 1.922 | 1.930 
1.5 1.938 | 1.947 | 1.955 | 1.964 | 1.972 | 1.981 | 1.990 | 1.998 | 2.007 | 2.015 
1.6 2.024 | 2.033 | 2.041 | 2.050 | 2.058 | 2.067 | 2.076 | 2.084 | 2.093 | 2.102 
1.7 2.110 | 2.119 | 2.128 | 2.136 | 2.145 | 2.154 | 2.162 | 2.171 | 2.180 | 2.188 
1.8 2.197 | 2.206 | 2.215 | 2.223 | 2.232 | 2.241 | 2.250 | 2.258 | 2.267 | 2.276 
1.9 2.285 | 2.294 | 2.303 | 2.311 | 2.320 | 2.329 | 2.338 | 2.346 | 2.355 | 2.364 
2.0 2.373 | 2.381 | 2.390 | 2.399 | 2.408 | 2.417 | 2.426 | 2.435 | 2.444 | 2.453 
2.1 2.462 | 2.470 | 2.479 | 2.488 | 2.497 | 2.506 | 2.515 | 2.524 | 2.533 | 2.542 
2.2 2.551 | 2.560 | 2.569 | 2.578 | 2.587 | 2.596 | 2.605 | 2.614 | 2.623 | 2.632
2.3 2.641 | 2.650 | 2.659 | 2.668 | 2.677 | 2.687 | 2.696 | 2.705 | 2.714 | 2.723 
2.4 2.732 | 2.741 | 2.750 | 2.759 | 2.768 | 2.777 | 2.786 | 2.795 | 2.805 | 2.814 
2.5 2.823 | 2.832 | 2.841 | 2.850 | 2.859 | 2.868 | 2.878 | 2.887 | 2.896 | 2.905 
2.6 2.914 | 2.923 | 2.932 | 2.942 | 2.951 | 2.960 | 2.969 | 2.978 | 2.987 | 2.997 
2.7 3.006 | 3.015 | 3.024 | 3.033 | 3.043 | 3.052 | 3.061 | 3.070 | 3.079 | 3.089 
2.8 3.098 | 3.107 | 3.116 | 3.126 | 3.135 | 3.144 | 3.153 | 3.163 | 3.172 | 3.181 
2.9 3.190 | 3.200 | 3.209 | 3.218 | 3.227 | 3.237 | 3.246 | 3.255 | 3.265 | 3.274 
3.0 3.283 | 3.292 | 3.302 | 3.311 | 3.320 | 3.330 | 3.339 | 3.348 | 3.358 | 3.367
3.1 3.376 | 3.386 | 3.395 | 3.404 | 3.413 | 3.423 | 3.432 | 3.441 | 3.451 | 3.460 
3.2 3.470 | 3.479 | 3.488 | 3.498 | 3.507 | 3.516 | 3.526 | 3.535 | 3.544 | 3.554 
3.3 3.563 | 3.573 | 3.582 | 3.591 | 3.601 | 3.610 | 3.620 | 3.629 | 3.638 | 3.648 
3.4 3.657 | 3.667 | 3.676 | 3.685 | 3.695 | 3.704 | 3.714 | 3.723 | 3.732 | 3.742 
3.5 3.751 | 3.761 | 3.770 | 3.780 | 3.789 | 3.799 | 3.808 | 3.817 | 3.827 | 3.836 
3.6 3.846 | 3.855 | 3.865 | 3.874 | 3.884 | 3.893 | 3.902 | 3.912 | 3.992 | 3.931 
3.7 3.940 | 3.950 | 3.959 | 3.969 | 3.978 | 3.988 | 3.997 | 4.007 | 4.016 | 4.026 
3.8 4.035 | 4.045 | 4.054 | 4.064 | 4.073 | 4.083 | 4.092 | 4.102 | 4.111 | 4.121 
3.9 4.130 | 4.140 | 4.149 | 4.159 | 4.169 | 4.178 | 4.188 | 4.197 | 4.206 | 4.216 
4.0 4.226 | 4.235 | 4.245 | 4.254 | 4.264 | 4.273 | 4.283 | 4.292 | 4.302 | 4.312 
warrant any more accuracy in the roots than is given by ms; and 33, 
and one would not do the two extra computations. The results in Figure 
5 were obtained by using the same set of observations (with mean 1.312 
and standard deviation .158) as was used to obtain the results of 
Figure 1. 
A SHORT-CUT METHOD OF FITTING 
A LOGISTIC CURVE 


WILLIAM A. SPURR AND DAVID R. ARNOLD 
Stanford University 


“Growth” curves have been used in many fields despite the 
extensive calculations required. These curves may be applied 
even more widely if their construction can be simplified. This 
article shows how to fit a logistic (Pearl-Reed) curve quite 
simply by means of a nomograph (which determines the upper 
limit from three selected points) and a logistic grid (which re- 
duces the curve itself to a straight line).¹ Other “growth” 
curves may be fitted by constructing appropriate graphs. 


THE METHOD 


THE FIRST STEP in fitting a logistic curve to a time series by any 
method is to determine whether the process represented has the char- 
acteristics of population growth that will justify the use of the Pearl- 
Reed equation as a logical approximation.² The next step is to plot the 
complete series on semi-logarithmic graph paper to determine empiri- 
cally whether the rate of growth is steadily declining. Then three points, 
sometimes geometric averages of several years, may be selected, 
representing typical levels of the early, middle, and recent periods. The 
points must be equidistant in time. 

The nomograph (Figure 1) is then used to determine the upper limit 
k of the logistic curve from three selected points $Y_0$, $Y_1$, and $Y_2$ as 
follows: Compute $Y_1/Y_0$ and $Y_2/Y_1$ by slide rule;³ place a transparent 
ruler across these values on the left and center scales, and read off 
$k/Y_0$ from the right hand scale. Multiply by $Y_0$ to find k. 

Once the upper limit k is found, the Pearl-Reed curve may be deter- 
mined readily from the logistic grid (Figure 2) which reduces this curve 
to a straight line. Divide the three selected points by k and plot these 
as percentages on the grid, using any convenient time scale on the 
X axis. Draw a straight line through the points⁴ and read the values of 
this line at intervals of time, including the future, if a forecast is de- 
sired. Multiply these values by k, plot on the original semi-logarithmic 
chart and connect by a smooth curve. This is the logistic curve. 
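
As a rough numerical counterpart to the grid procedure just described, the sketch below assumes that plotting Y/k on the logistic grid is equivalent to plotting log_e(k/Y - 1) on a linear scale, which is how the article constructs the logistic scale. The function names and the sample figures are illustrative only, and the upper limit k is taken as already known.

```python
import math

def grid_ordinate(y, k):
    """Linear position of Y on the logistic grid: log_e(k/Y - 1)."""
    return math.log(k / y - 1.0)

def fit_line_through_points(xs, ys, k):
    """Straight line on the grid through the first and last selected points.

    Returns (a, b) such that the grid ordinate equals a + b*X.
    """
    g_first = grid_ordinate(ys[0], k)
    g_last = grid_ordinate(ys[-1], k)
    b = (g_last - g_first) / (xs[-1] - xs[0])
    a = g_first - b * xs[0]
    return a, b

def read_curve(a, b, k, x):
    """Read the line at time x and convert back to the original units."""
    return k / (1.0 + math.exp(a + b * x))

# Hypothetical example: three equidistant points and an assumed limit k = 4.
xs = [0, 1, 2]
ys = [1.0, 2.0, 3.0]
k = 4.0
a, b = fit_line_through_points(xs, ys, k)

# Values of the curve at successive times, including three future periods.
fitted = [read_curve(a, b, k, x) for x in range(0, 6)]
```

If k is correct for the three points, the middle point falls on the same line as the other two; a visible departure plays the role of the arithmetic check mentioned in footnote 4.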

1 The writers are indebted to Professors Albert Bowker and Alfred Niles of Stanford University 
for suggestions on this paper. 
2 See R. Pearl and L. J. Reed, The Biology of Population Growth, Knopf, 1925. 


3 $Y_2/Y_1$ must be less than $Y_1/Y_0$ for this type of curve. 
4 If the three points do not fall on a straight line, an arithmetic error has been made. 
These graphs may be used in any of four ways: 

(1) to obviate all mathematical calculations except for several 
slide-rule ratios; 

(2) to test any three selected points for goodness of fit. If the 
graphic logistic curve fits the data poorly, other combinations 
of three points may be tried. Then k may be computed mathe- 
matically for the points finally chosen; 

(3) to serve as a quick check on the accuracy of machine calcula- 
tion, since any error is likely both to distort k considerably and 
to destroy the linearity of the three points on the logistic 
graph; 

(4) to make a further test of the validity of the logistic equation 
itself if desired. The raw data multiplied by 1/k must be roughly 
linear when plotted on the logistic grid to justify the use of 
this equation. 

The following procedure is recommended: Use the nomograph for 
purposes (2) and (3) above, but also compute 1/k mathematically, since 
the nomograph is fairly approximate. Then fit the actual curve by 
means of the logistic grid alone, since this is probably quite accurate 
enough. Enlarge Figures 1 and 2 by photostat to a convenient size, and 
use tracing paper over the latter. 


DISCUSSION 


The logistic, Gompertz, and other S-shaped curves representing 
retardation of rate of growth have come into wide use since 1920 as 
means of measuring and projecting biological growth generally, and 
secular trends of population and industries in particular. The simple 
logistic or Pearl-Reed curve fitted to three selected points is probably 
the most widely used of these curves,⁵ so it has been selected for solu- 
tion here. 

The logistic curve may be written: 

$$Y = \frac{k}{1 + e^{a+bX}}$$

where the curve ordinate Y approaches k, the upper limit, as time, X, 
increases; e is the natural logarithm base 2.71828, and a and b are con- 
stants, b being negative. 

Even this simple form of the growth curve, however, requires the 
lengthy calculation of the three constants, k, a, and b, a procedure 
that must be repeated for each set of three points that are tested for 

5 See Simon Kuznets, Secular Movements in Production and Prices, pp. 59-68, 197-199, Houghton 
Mifflin, 1930, or a recent popular description in E. R. Dewey and E. F. Dakin, Cycles, Holt, 1947, 
Ch. I-IV. 
goodness of fit.⁶ Moreover, despite the precision of mathematical cal- 
culation, the results are only approximate, since (1) the equation itself 
is merely one of several empirical approaches to the “law of growth”, 
(2) the selection of three points is subjective, and (3) errors of any kind 
are exaggerated in extrapolating to find the upper limit. 

The graphic method obviates the following calculations: 


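$$k = \frac{2Y_0Y_1Y_2 - Y_1^2(Y_0 + Y_2)}{Y_0Y_2 - Y_1^2}$$

$$a = \log_e \frac{k - Y_0}{Y_0}$$

$$b = \frac{1}{n} \log_e \frac{Y_0(k - Y_1)}{Y_1(k - Y_0)}$$

$$Y = \frac{k}{1 + e^{a+bX}}$$ (This must be calculated for each point of the fitted curve.) 

Here n is the number of time units between successive selected points, with X measured from the first point. 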


The errors of graphic measurement are believed to be small compared 
with the errors implicit in the Pearl-Reed equation itself and the 
selected points as described above. 
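
For comparison with the graphic method, the calculation it replaces can be sketched in a few lines. This is a minimal illustration, not the authors' own procedure: the function name is hypothetical, X is assumed to be measured from the first selected point, and n (the spacing between the selected points in time units) is an assumed parameter.

```python
import math

def fit_logistic_three_points(y0, y1, y2, n=1.0):
    """Constants k, a, b of Y = k / (1 + e^(a + bX)) from three equidistant points.

    y0, y1, y2 are the selected values at X = 0, n, 2n (assumed convention).
    """
    denom = y0 * y2 - y1 ** 2
    if denom == 0:
        # Y1/Y0 equals Y2/Y1: pure exponential growth, no finite upper limit.
        raise ValueError("upper limit is infinite for these points")
    k = (2.0 * y0 * y1 * y2 - y1 ** 2 * (y0 + y2)) / denom
    if y0 <= 0 or k <= max(y0, y1, y2):
        raise ValueError("points are not consistent with a rising logistic curve")
    a = math.log((k - y0) / y0)
    b = math.log(y0 * (k - y1) / (y1 * (k - y0))) / n
    return k, a, b

# The points 1, 2, 3 give k = 4; moving the third point to 3.5 gives k = 8.
k_low, a_low, b_low = fit_logistic_three_points(1.0, 2.0, 3.0)
k_high, _, _ = fit_logistic_three_points(1.0, 2.0, 3.5)
```

The two calls show how strongly k responds to the third point, which is the reason the text treats the selection of points, rather than the arithmetic, as the main source of error.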


THE NOMOGRAPH 


The nomograph for finding the upper limit was constructed by 
reducing the logistic equation to: 


$$\theta B - \theta A - 2AB + A^2 B + A = 0$$


where $\theta = k/Y_0$, $A = Y_1/Y_0$, and $B = Y_2/Y_1$. 
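
The reduction can be checked by a short derivation. The sketch below assumes the starting point is the three-point expression for k, and substitutes $Y_1 = AY_0$ and $Y_2 = ABY_0$:

$$k = \frac{2Y_0Y_1Y_2 - Y_1^2(Y_0 + Y_2)}{Y_0Y_2 - Y_1^2} = \frac{Y_0^3\,(2A^2B - A^2 - A^3B)}{Y_0^2\,(AB - A^2)} = Y_0\,\frac{2AB - A - A^2B}{B - A}$$

Dividing by $Y_0$ and clearing the fraction gives $\theta(B - A) = 2AB - A - A^2B$, which rearranges to $\theta B - \theta A - 2AB + A^2B + A = 0$.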

This equation was expressed as a third-order determinant which was 
multiplied by a matrix of transformation to produce a nomographic 
determinant.⁷ 

The equations for the coordinates of the nomographic scales (in 
arbitrary linear units) were then found to be: 


6 See F. E. Croxton and D. J. Cowden, Applied General Statistics, pp. 452-458, Prentice-Hall, 1939. 
The methods of least squares and moments are much more cumbersome than that described. 

The present method appears to be shorter than that recently suggested by Dudley J. Cowden 
(Journal of the American Statistical Association, Dec. 1947, pp. 585-590), since: (1) the trial values of k 
may be determined from a nomograph, rather than computed; (2) the logistic grid straightens out the 
logistic curve itself, whereas Professor Cowden uses a semi-logarithmic grid to produce a linear trans- 
formation of the function $\frac{1}{Y_c} - \frac{1}{k}$, which requires five extra columns of computations or scale readings 
(p. 589). The method described here should also be shorter than that of Raymond Pearl (Introduction 
to Medical Biometry and Statistics, Third Edition, Chap. XVIII, W. B. Saunders Co., Philadelphia, 
1940) for similar reasons. 

7 See F. T. Mavis, The Construction of Nomographic Charts, pp. 56-57, Scranton, Pennsylvania: 
International Textbook Co., 1939. 
$$X_\theta = 0, \qquad Y_\theta = \frac{\theta - 10}{4}$$

$$X_B = \frac{2.52B}{0.76B - 4}, \qquad Y_B = \frac{5.76B - 9}{4 - 0.76B}$$

$$X_A = \frac{2.52A}{0.76A - 4}, \qquad Y_A = \frac{7.76A - A^2 - 10}{4 - 0.76A}$$


A nomograph of any size may be constructed from these equations 
by using an appropriate scale. However, it is simpler merely to enlarge 
Figure 1 by photostat or photograph, extending the $k/Y_0$ line upward 
if desired. 
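
The working of the nomograph can be imitated numerically. The sketch below takes the scale coordinates as $Y_\theta = (\theta - 10)/4$ on the line $X = 0$, $X_A = 2.52A/(0.76A - 4)$, $Y_A = (7.76A - A^2 - 10)/(4 - 0.76A)$, $X_B = 2.52B/(0.76B - 4)$ and $Y_B = (5.76B - 9)/(4 - 0.76B)$. This is an assumed reading of the scale equations, chosen because it makes the three scale points collinear exactly when $\theta B - \theta A - 2AB + A^2B + A = 0$. The function names are illustrative.

```python
def a_point(A):
    """Position of the value A = Y1/Y0 on its scale."""
    return 2.52 * A / (0.76 * A - 4.0), (7.76 * A - A ** 2 - 10.0) / (4.0 - 0.76 * A)

def b_point(B):
    """Position of the value B = Y2/Y1 on its scale."""
    return 2.52 * B / (0.76 * B - 4.0), (5.76 * B - 9.0) / (4.0 - 0.76 * B)

def read_theta(A, B):
    """Lay a straight edge across the A and B points and read k/Y0 where it cuts X = 0."""
    xa, ya = a_point(A)
    xb, yb = b_point(B)
    slope = (yb - ya) / (xb - xa)
    y_at_zero = ya - xa * slope
    # Invert the assumed theta scale, Y = (theta - 10) / 4.
    return 4.0 * y_at_zero + 10.0

# Points 1, 2, 3 give A = 2 and B = 1.5; the upper limit is then 4 times Y0.
theta = read_theta(2.0, 1.5)
```

Reading the intersection with $X = 0$ is the same operation as placing the ruler across the two ratio scales; the sketch fails, as the chart does, when A equals B.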

The error in reading k from the nomograph is on the order of 2 per 
cent for the usual range of k values shown in Figure 1 ($4 < k/Y_0 < 45$) 
using a 7” by 10” chart, but is reduced for all points on the logistic 
curve since they are only fractions of k. These errors are fairly small 
compared with the hyper-sensitivity of k itself to the three subjectively 
chosen points.⁸ The error increases for the extremely high and low 
values of k which are not shown. These should be computed mathe- 
matically. 

The nomograph should in general be more widely used in sta- 
tistics, as it is in engineering. It has these advantages: (1) the chart is 
simple and easy to read; (2) interpolation is made fairly accurately 
along a scale; and (3) the chart shows at any point the nature of the 
change in one variable due to changes in the other variables.⁹ A curved 
nomograph, such as this one, saves even more labor than the more 
usual linear type,¹⁰ since it replaces a more complex equation with the 
attendant labor of calculation. 

In case it is desired to fit a logistic curve by using all available points 
rather than only three points, the following short-cut method is sug- 
gested for approximating the upper limit k. This method utilizes the 
principle that the per cent rate of increase in a logistic curve is a linear 


8 For example, if the first two points are 1 and 2, a variation in the third point from 3 to 4 will 
cause k to vary from 4 to infinity. Estimates of the upper limit of California’s population (which is 
nearly an exponential curve) by 18 Stanford students based on various selections of three points from 
the same data varied from 18 to 613 millions. The variation is relatively small, though still considerable, 
in an older series. According to M. C. Rorty (Journal of the American Statistical Association, March, 
1931, pp. 8-9) the growth curve is significant when the series has reached within 20 to 30 per cent of the 
saturation point. 

9 Joseph Lipka, Graphical and Mechanical Computation, p. 44 f, Wiley, 1918. 

10 See J. W. Dunlap and A. K. Kurtz, Handbook of Statistical Nomographs, Tables, and Formulas, 
Part I, World Book Co., Yonkers, New York, 1932, for an excellent compilation of linear statistical 
nomographs. 
function of the population itself, and that the rate approaches zero as 
the population approaches k. Compute by slide rule the percentage in- 
creases in the data from point to point. Then estimate the geometric 
mean of each pair of points from the middle of the line connecting the 
points on semi-logarithmic paper.¹¹ Now plot the percentage increases 
on the Y axis against the corresponding means on the X axis in a 
scatter diagram. Fit a straight line to these points by inspection or by 
least squares. (The goodness of fit of this line is a test of the validity of 
the logistic.) This line extended will cross the X axis at the desired 
value of k. However, as Yule said,¹² “I do not think this is a very good 
method of fitting.” The graphic form of the method described above is 
shorter than Yule’s or Hotelling’s versions and is about as accurate as 
Yule’s; but it still involves several approximations, and is much more 
work than the use of the nomograph with three points only. 
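
The all-points approximation described above can be sketched as follows. The geometric means are computed directly rather than read from semi-logarithmic paper, and the line is fitted by least squares rather than by inspection. The function name and the synthetic series are illustrative assumptions, not data from the article.

```python
import math

def estimate_upper_limit(series):
    """Approximate k from all points of an equally spaced, rising series."""
    rates = []   # percentage increase from each point to the next
    means = []   # geometric mean of each pair of adjacent points
    for prev, nxt in zip(series, series[1:]):
        rates.append(100.0 * (nxt - prev) / prev)
        means.append(math.sqrt(prev * nxt))

    # Least-squares line: rate = intercept + slope * mean.
    n = len(means)
    mean_x = sum(means) / n
    mean_y = sum(rates) / n
    sxx = sum((x - mean_x) ** 2 for x in means)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(means, rates))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # The line extended crosses the X axis at the estimate of k.
    return -intercept / slope

# Synthetic logistic series with k = 100, a = 3, b = -0.2 (hypothetical values).
series = [100.0 / (1.0 + math.exp(3.0 - 0.2 * t)) for t in range(41)]
k_hat = estimate_upper_limit(series)
```

Even on exact logistic data the estimate is only close to k, not equal to it, because the percentage increase over a finite interval is not strictly linear in the geometric mean; this is one of the approximations the text refers to.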


THE LOGISTIC GRID 


The logistic grid not only provides a linear transformation of the 
logistic curve, but may also be used to find the remaining constants in 
its equation, if desired, as follows: 


$$b = -\frac{1}{X_{50\%} - X_{26.89\%}}$$

$$a = -bX_{50\%}$$


where the X values are the abscissas corresponding to the Y/k values 
of 50 per cent and 26.89 per cent respectively.¹³ 
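
A small sketch of this reading of the constants follows. It assumes the relations $a = -bX_{50\%}$ and $b = 1/(X_{26.89\%} - X_{50\%})$, which follow from setting Y/k to 50 per cent and to 1/(1 + e) in the logistic equation. The function name and the sample abscissas are hypothetical.

```python
import math

def constants_from_grid(x_50, x_2689):
    """Constants a and b from the abscissas at which the fitted line
    crosses Y/k = 50 per cent and Y/k = 26.89 per cent."""
    b = 1.0 / (x_2689 - x_50)
    a = -b * x_50
    return a, b

# Hypothetical readings: the line crosses 26.89 per cent at X = 10
# and 50 per cent at X = 15.
a, b = constants_from_grid(15.0, 10.0)

# The 26.89 per cent level is the value of 1 / (1 + e).
p = 1.0 / (1.0 + math.e)
```

With these readings b is negative, as the logistic equation requires for a rising curve.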

Any logistic scale may be computed as follows: For all desired scale 
values of Y/k up to 50%, list the corresponding values of $\log_e (k/Y - 1)$ 


11 More precise but laborious methods of computing these values have been suggested by Yule 
(Journal of the Royal Statistical Society, 1925, pp. 1-58) and Hotelling (This Journal, September, 
1927, pp. 283-314). Hotelling shows an illustrative scatter diagram of this general method on page 299. 

12 Op. cit., Method 3, p. 52. 

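13 Proof: 1. Let Y/k = 50% in the logistic equation. Then $a = -bX_{50\%}$. 

2. Let Y/k = p. Then $a + bX = \log_e \left(\frac{1-p}{p}\right)$. Substituting a above, 

$$b = \frac{\log_e \left(\frac{1-p}{p}\right)}{X_p - X_{50\%}}$$

3. Let $\log_e \left(\frac{1-p}{p}\right) = 1$. Then $p = \frac{1}{1+e} = 26.89\%$. 

$$\therefore\; b = \frac{1}{X_{26.89\%} - X_{50.00\%}}$$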
FIGURE 2 


from tables of reciprocals and natural logarithms (such as Barlow’s 
Tables and the Federal Works Agency, Tables of Natural Logarithms). 
Multiply the results by a constant to obtain the desired scale. Then 
measure these figures as linear distances down from the center of the 
chart for Y/k values below 50 per cent and up for complementary 
values above 50 per cent. For example, for Y/k=.01 per cent (or 99.99 
per cent) k/Y − 1 = 9999, its $\log_e$ is 9.210, so this line would be scaled 
9.210 inches or a multiple thereof below (or above) the middle of the 
chart.¹⁴ 

Again, however, a photostatic enlargement of Figure 2 is a simpler 
method of constructing this scale. 
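
For readers who prefer to compute the scale, the construction can be sketched as below. The function name, the sign convention (negative meaning below the center of the chart), and the `unit` multiplier are assumptions made for illustration.

```python
import math

def scale_position(pct, unit=1.0):
    """Signed distance from the center of the chart for a Y/k value in per cent.

    Negative values lie below the center (Y/k under 50 per cent),
    positive values above it. `unit` is the scale constant.
    """
    p = pct / 100.0
    return -unit * math.log(1.0 / p - 1.0)

# Positions for a few scale lines, in arbitrary linear units.
levels = [0.01, 1.0, 10.0, 26.89, 50.0, 73.11, 90.0, 99.0, 99.99]
positions = {pct: scale_position(pct) for pct in levels}
```

The positions are symmetric about the 50 per cent line, so only the lower half needs to be computed and the upper half follows by reflection, as the text describes.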

The accuracy of reading the logistic graph varies with the scale and 
with the vertical position of points on the chart. For a 7” by 10” chart, 
the average error will vary from about .01 of 1 per cent at the top and 
bottom to about 0.5 of 1 per cent at the center. The error will be pro- 
portionally less for larger charts. Again, these errors seem small com- 
pared with the approximations involved in the selection of three points 
and the equation itself. The logistic scale, unlike the nomograph, is valid 
for all values of k. 

The logistic grid superficially resembles the probability grid which 
provides a linear transformation of the cumulative normal probability 
curve, derived from the law of errors rather than the “law of growth.” 
The use of the logistic scale will avoid the sizeable error involved in the 
use of the probability scale as a means of approximating the Pearl- 
Reed curve.¹⁵ 


14 Linear ordinates (r) for integral percentages of Y/k (y’) from 1 per cent to 99 per cent are listed 
by Yule, op. cit., p. 48. 

15 For example, Croxton and Cowden, op. cit., pp. 458-461, reach an upper limit of 400 millions 
for U. S. population from probability paper, compared with 190.8 millions from the Pearl-Reed equation 
and their consensus of “well under 200 millions” from other sources. The logistic function is claimed 
to be superior to the probability function as a measure of growth by C. P. Winsor (Journal of the 
Washington Academy of Sciences, February 19, 1932, pp. 73-84) and by J. Berkson (Journal of the 
American Statistical Association, September, 1944, p. 357 f). It is criticized by A. L. Bowley (Journal 
of the Royal Statistical Society, 1925, pp. 76-81), G. R. Davies (Journal of the American Statistical 
Association, 1927, pp. 370-374), and G. H. Knibbs (Journal of the American Statistical Association, 
1926, p. 381 f and 1927, p. 49 f). 
Bevezetés a Statisztika Tudományába: Rész I. (Introduction to the Science of 
Statistics, Part 1.) Gyula Abay. Pécs, Hungary: Rákóczi Nyomda (Rákóczi-út 
35), 1945. Pp. ii, 98. 

Statisztika Módszertani Alapvetés. (Elements of Statistical Methodology.) 
Loránd Schweng. Budapest, Hungary: Stephaneum Nyomda, 1944. Pp. xviii, 
366. 


REVIEW BY EUGENE LUKACS 
Professor of Mathematics, Our Lady of Cincinnati College 
Cincinnati, Ohio 


ABAY’S textbook, intended for students of law and social sciences, is di- 
vided into five chapters. Chapter 1 surveys the various fields of applied 
statistics and classifies social statistics into statistics involving population 
problems, economic statistics, and statistics related to public health and 
education. Chapter 2 discusses the history of statistics with particular refer- 
ence to Hungarian statistics. Chapter 3 treats the gathering of statistical 
data and the work of the official Hungarian statistical services. Chapter 4 dis- 
cusses the organization of the data and the construction of empirical fre- 
quency distributions; numerous examples and graphs are given. Chapter 5 
deals with measures of central tendency and dispersion. 

Although Abay states that it is impossible to discuss modern statistics 
without using mathematical methods, he avoids the use of mathematical 
formulae. There are a few incorrect statements: for example, on pages 9 and 
10, the author states that every frequency curve would be a normal curve if 
one only had enough data. Statistical methods developed within the past 
forty years are not considered. The book is not likely to lead the student to a 
proper appreciation of statistical methods. 

Schweng’s text—prepared for students of social sciences, especially eco- 
nomics—discusses in great detail methods of collecting and organizing sta- 
tistical data. The first part of the book deals with the collection of data and 
the nature of statistical tables. The proper consideration of significant fig- 
ures is greatly emphasized. The second part discusses graphs, diagrams, 
ratios, and percentages. The third part presents an elementary discussion of 
frequency distributions and measures of central tendency and dispersion. 
Very few formulae are used, and no proofs are given. The normal distribu- 
tion is barely mentioned. Parts 4 and 5 give brief introductions to time series 
and index numbers. Part 6 surveys the most important Hungarian index 
numbers. 

Numerous examples are given throughout the text. Statements and defi- 
nitions are always illustrated by tables and graphs. The 5-page bibliog- 
raphy lists a large number of American, British, French, German, and 
Italian books. The book does not reflect the progress of theoretical statistics 
during the last forty years and therefore can not be considered an introduc- 
tion to modern statistical methods. Nevertheless, the book may prove use- 
ful for the consumer of Hungarian statistics rather than for the producer of 
statistical data. 


College Algebra. A. Adrian Albert (Professor of Mathematics, University of 
Chicago). New York 18: McGraw-Hill Book Co., Inc. (330 West 42nd St.), 
1946. Pp. xii, 278. $2.75. (London W.C. 2: McGraw-Hill Publishing Co. Ltd, 
[Aldwych House, Aldwych]. 14s.) 


REVIEW BY PAUL S. DWYER 
Professor of Mathematics 
University of Michigan, Ann Arbor, Michigan 


THIS book incorporates a drastic revision of the conventional approach 
and treatment of the subject matter of college algebra and as such 
deserves the attention of statisticians. Important objectives of the author 
are: 

a) The material should be presented as a unified and compact body of 
mathematical theory. As worked out by the author, this unity is achieved 
by the successive study of such general topics as the number systems of ele- 
mentary mathematics, polynomials and allied functions, algebraic identities, 
equations and systems of equations. Considerable material not usually pre- 
sented in a course in college algebra is included in this approach. 

b) The definitions and theorems should be stated accurately. This means, 
of course, that the author has been quite formal in the presentation of the 
material. This formality will probably be approved by mathematical statis- 
ticians. It may not be so welcome to other statisticians who wish to study the 
main results with as little of the vocabulary and notation of the mathe- 
matician as is possible. 

c) Proofs of results should be given only when it is probable that the better 
students will understand them and when their inclusion will add to the 
understanding of the results. The author is very careful to indicate each 
statement which is lacking in proof. In this connection, the final chapter 
(Chap. 10) on matrices and quadratic forms should be noted since the 
author makes no pretense at proofs here. This chapter, frankly experi- 
mental, is for the purpose of providing matrix material for those who un- 
derstand the definitions, notation and statements of theorems. As such it 
will be of interest to many statisticians. 

d) There should be an adequate number of illustrative examples and 
exercises. The author has provided these and it seems that he has provided 
another feature which will be of interest to applied statisticians—a greater 
emphasis upon numerical work than is found in most texts on college algebra. 
e) The text should abound with additional material on the same subject 
for the better student. The author has accomplished this with the use of ad- 
ditional sections and chapters which are indicated as “full course” and which 
tend to amplify the minimum treatment. In some of this optional material 
the author attempts to relate the subject matter of college algebra to other 
material in college mathematics. Chapter 8, “Vectors in the Plane” is of this 
sort. 

The statistician will be particularly interested in the treatment of Chapter 
9, “Matrices, Determinants and Linear Systems,” and Chapter 10, referred 
to above. He may or may not approve of the introduction of the concept of a 
matrix before that of a determinant, but he certainly will approve of the use 
of elimination methods, rather than determinants, in solving linear simul- 
taneous equations. It is unfortunate that the author does not also show that 
elimination (condensation) methods can be used in evaluating determinants 
and, in fact, that the same general procedures can be employed in the for- 
ward solution no matter whether one solves equations, evaluates determi- 
nants, or computes the inverse of a matrix. The author does give a general 
treatment of determinants of any order. 

Chapter 10 has been described above. It gives a number of basic theorems 
and techniques which are commonly useful, together with simple problems. 
It is unfortunate that the definition of matrix multiplication is given only 
in terms of the formal row-by-column multiplication of A and B and is not 
extended to the frequently useful column-by-column multiplication of A’ 
and B or the row-by-row multiplication of A and B’. 

The first copies of the book contained an unusually large number of mis- 
takes not only in the answers to the problems but in the theory itself. Most 
of these have been removed in the second printing. 

The author is to be commended for the way in which he has accomplished 
his objectives. Each reader must determine for himself whether these are 
appropriate objectives for him. I think that we as statisticians will agree 
that the emphasis upon numerical work, the more practical approach to the 
problem of simultaneous linear equations, and the inclusion of matrix ma- 
terial are all steps in the right direction. 


A Concise Manual of Statistics: With Special Reference to the Requirements of 
Students for Municipal Examination, Second Edition. Clement Burton (Borough 
Treasurer’s Department, Hackney Town Hall, London E. 8, England). London 
E.C. 2: Gee and Company (Publishers) Limited (27-28, Basinghall St.), 1946. 
Pp. 182. 15s. 


REVIEW BY HARRY PELLE HARTKEMEIER 
Professor of Business Statistics, University of Missouri 
Columbia, Missouri 


THE preface of this book contains the statement: “The increasing im- 
portance of the subject of statistics in commercial and municipal activity 
has prompted the preparation of this handbook in the hope that it will prove 
of interest generally, and will be of assistance to those who are studying for 
the examinations of the various professional bodies of accountants, and more 
particularly for those of the Institute of Municipal Treasurers and Account- 
ants (Incorporated).” In England it is necessary for a person to pass an 
examination in the field of statistics before he is allowed to practice with 
something equivalent to our C.P.A. This is an excellent practice and the 
authorities responsible are to be congratulated. It would be a very good idea 
if accountants in the United States were required to demonstrate a similar 
knowledge of statistics before they are recognized as Certified Public Ac- 
countants. Most accountants are deficient in a knowledge of sampling tech- 
niques and tests of significance or reliability. 

This book is an excellent one in view of its purpose as a manual or hand- 
book. It will serve an accountant well when he wants to refresh his memory 
about a statistical formula or process. Presumably, the accountant has al- 
ready known something about statistics and needs primarily a rapid review. 
This book is not so well suited to be an elementary textbook for a new stu- 
dent to use to obtain an introduction to statistics. One book could hardly be 
both an elementary textbook and a concise manual or handbook. 

The book seems a bit jerky at some points. One gets the impression that 
it was written a little longer and parts were deleted to shorten the book. 
This could have been the work of some editor not familiar with the material 
or subject. At most places the material reads smoothly and for an American 
reader the book has some delightfully surprising expressions. For example, 
rounded off data are items “shorn of digits.” 

Everyone has the right to select his own symbols, but the American reader 
is not apt to like the choice of a for the arithmetic average, Z for the mode, 
and M for the median. The usual meaning for the probable error is found on 
pages 123 and 124 in connection with the explanation of correlation, but there 
may be some confusion when the reader compares this meaning with the 
statement on page 29, “When figures are taken, for example, to the nearest 
100, the maximum possible error is 49; but as the error may be anywhere 
between 0 and 49 the average or probable error is 24.5.” 

On page 124 the reader encounters eight statements dealing with the 
interpretation of the value of r. Two of them are: “If r is less than .3—little 
evidence of correlation.” “If r is more than 3 times the probable error—evi- 
dence of correlation.” No mention is made of the possibility that r could 
easily be 3 times the probable error and be less than .3. The two different 
conclusions conflict. What could be said about such a situation is that when 
r is more than 3 times the probable error there is reliable evidence that some 
correlation exists but when r is also less than .3 the correlation is not large 
enough to be of practical value for estimating the size of one variable from a 
knowledge of the other variable and the equation of relationship. 

Very careful type setting or very good proofreading was responsible for 
the fact that very few errors are noticed. 
One of the principal virtues of the book is the honest recognition of the 
value of the work of the statistician. “The position of the statistician lies 
between the accountant or cost accountant and the economist or executive 
authority.” No doubt this will sound like heresy to accountants in the United 
States, who for years have claimed the position of being the right-hand man 
of the executive. At long last, people are beginning to realize that accounting 
is but a part of the much broader science of business statistics. 


Mathematische Erblichkeitsanalyse von Populationen. Gunnar Dahlberg (Pro- 
fessor and Head of the Swedish State Institute for Human Genetics and Race 
Biology, Stockholm, Sweden). Aus Dem Staatlichen Institut für Menschliche 
Erblichkeitsforschung und Rassenbiologie zu Uppsala. Diese Arbeit erscheint 
auch als Supplementum CXLVIII von Acta Medica Scandinavica. Uppsala, 
Sweden: Almqvist & Wiksells, 1943. Pp. 219. Paper. 


REVIEW BY OYSTEIN ORE 
Professor of Mathematics, Yale University 
New Haven, Connecticut 


IN THIS work the author gives a review of the applications of mathematical 
methods to certain problems of inheritance. Particular emphasis is placed 
on the consequences for human society of some of the mathematical theories 
and hypotheses and almost all illustrations are drawn from this sphere. A basic 
knowledge of genetical terminology is presupposed. The introductory sec- 
tions are devoted to a discussion of the use of the elements of probability to 
the determination of the various gene combinations and their frequencies 
as they may occur in a population. In the later sections considerations and 
methods of an actuarial nature predominate. Here one finds an extensive 
analysis of the effect which certain conditions in the population, for in- 
stance occurrence of mutations, existence of isolated groups and selection 
of various kinds, e.g. mating with closely related individuals, will eventually 
have on the gene frequencies in a society. Of particular value are the numer- 
ous tabular computations which illustrate the consequences of the various 
numerical assumptions one may make in regard to the perturbing factors. 


Cycles: The Science of Prediction. Edward R. Dewey (Director, Foundation for 
the Study of Cycles, 274 Madison Ave., New York, N. Y.) and Edwin F. Dakin 
(Counsel in Public Relations, Plymouth, N. H.). New York 10: Henry Holt & 
Co. (257 Fourth Ave.), 1947. Pp. xi, 255. $3.00. Two reviews follow. 


REVIEW BY MILTON FRIEDMAN 
Associate Professor of Economics, The University of Chicago 
Chicago, Illinois 
THERE are, say Dewey and Dakin, a number of important rhythms or 
cycles in economic activity. These cycles have fixed periods and are com- 
mon to many different activities. Any particular economic series is to be 
conceived as the sum of such cycles superimposed on a trend and dis- 
turbed—though only temporarily—by such erratic and random factors as 
wars. The appropriate technique for predicting the future course of any 
economic series is to determine empirically from its past behavior its trend 
and the chief cycles it contains, to synthesize these, and extrapolate them into 
the future. Dewey and Dakin conclude from analysis of this type so far per- 
formed that the four most important of the many rhythms that have been isolated 
in our economy are: 
A. The 54-year rhythm in wholesale prices and industrial innovations. 
B. The 9-year rhythm in wholesale prices, security prices, pig iron pro- 
duction, and industrial activity. 
C. The 3½-year rhythm in security prices and in business activity, whole- 
sale and retail. 


D. The 18-year rhythm in real estate activity, and in related industrial 
enterprise (p. 135). 


These rhythms operate in an economy that is growing at a declining rate and 
is approaching “maturity,” so that their downward phases will appear rela- 
tively more severe, and their upward phases less buoyant, in the future than 
in the past. The 54-year rhythm turned down in 1925, and “is due to reach 
bottom in 1952” (p. 226); the 9-year rhythm was due for a peak in 1946 and 
a subsequent decline until 1951; the 3½-year rhythm was due for a peak in 
1947 and again in 1950 and for lows in 1948 and 1951; the 18-year rhythm in 
real estate activity “apparently reached its high about late 1942” and is 
“due to decline to a low around 1953” (p. 227). “The foregoing is not meant 
to suggest that our rhythms indicate collapse of our economy. We may only 
infer, if they continue, that around 1947 a fall-off will start, and our economy 
will experience rather protracted declines in prices, production, employment, 
and economic activity generally, reaching a bottom that would presumably 
be dated sometime in the early fifties” (p. 230). 
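The procedure summarized above (a trend plus cycles of fixed period, fitted to the past and carried forward) can be sketched as follows. This is only an illustration of the mechanics, not the authors' actual computation, which the book does not describe in detail. The linear trend, the least-squares criterion, the function names, and the convention of measuring time in years from the start of the series are all assumptions made for the example; only the four periods come from the review.

```python
import math

# Periods (in years) of the four rhythms listed in the review.
PERIODS = (54.0, 18.0, 9.0, 3.5)


def design_row(t, periods=PERIODS):
    """Regressors at time t: intercept, linear trend, and a cosine/sine pair per period."""
    row = [1.0, t]
    for p in periods:
        angle = 2.0 * math.pi * t / p
        row.extend([math.cos(angle), math.sin(angle)])
    return row


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_trend_and_cycles(times, values, periods=PERIODS):
    """Least-squares coefficients for trend plus fixed-period cycles (normal equations)."""
    rows = [design_row(t, periods) for t in times]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(k)]
    return solve(ata, aty)


def extrapolate(coef, t, periods=PERIODS):
    """Value of the fitted trend-plus-cycles curve at time t (possibly in the future)."""
    return sum(c * x for c, x in zip(coef, design_row(t, periods)))
```

Such a fit reproduces a series exactly only when the series really is a trend plus cycles of precisely these periods; whether economic series behave that way is the question the reviewers dispute.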

The book presenting this message is difficult to classify. It is not a scientific 
book: the evidence underlying the stated conclusions is not presented in full; 
data graphed are not identified so that someone else could reproduce them; 
the techniques employed are nowhere described in detail. Yet it is not a popu- 
lar book: there are too many graphs and too much technical jargon to call it 
that. Its closest analogue is the modern high-power advertisement—here of 
book length and designed to sell an esoteric and supposedly scientific product. 
Like most modern advertising, the book seeks to sell its product by making 
exaggerated claims for it (that it will give sufficiently accurate forecasts to 
contribute substantially to success in business), showing it in association 
with other valued objects which really have nothing to do with it (established 
cycles in physical, meteorological, or other natural phenomena, and the 
successful use of cycles in predicting such phenomena as tides), keeping 
discreetly silent about its defects or mentioning them in only the vaguest 
form (no examples of past failures of cyclical patterns to reproduce
themselves are mentioned), and citing authorities who think highly of the
product. As with most modern advertising, the reader will find the book easy
and interesting to read; when he finishes, he will find himself favorably in- 
clined toward the product; yet, when he reflects, he will find that there is 
little reason why he should be. What is good is commonplace, and what is 
claimed as exceptional is unproven. There is little doubt that approximate 
rhythms exist in economic activities, that there is a building cycle in the 
United States about 18 years in length, and a business cycle about 3½ years
in length. There is also little doubt—though the authors do not mention this 
—that every attempt to use these facts or any others to forecast economic 
activity has to date met with failure. Beyond this, we know little. The 
authors assert a good deal more, but these assertions, though illustrated by 
selective examples, are not demonstrated by any comprehensive or carefully 
presented body of evidence. Even the techniques used to derive the particular 
conclusions about lengths of cycles and the like are only alluded to in passing 
in a chapter (Chapter 11, “Analysis and Synthesis”) which is clearly in- 
tended to impress the reader with the scientific magic used to get the stated 
results rather than to initiate him into the art or to enable him to test the 
results obtained by the authors. 

An excellent, though somewhat extreme, example of the level of analysis 
in the book is its treatment of secular trends. The book makes much of the 
universality of retardation in the rate of growth of individual industries. 
From this well established phenomenon, it erroneously deduces that the 
economy at large must grow at a retarded rate and approach maturity. And 
it makes this erroneous deduction despite a reference (in which, incidentally, 
the title is given incorrectly) to A. F. Burns’ Production Trends in the 
United States Since 1870 as “required reading” (p. 52). Yet one of the most
important conclusions of Burns’ book is that retardation in the rate of growth 
of the economy as a whole is not a necessary result of retardation in the rate 
of growth of each industry separately; and, further, that there is little evi- 
dence in American data of the existence of retardation for the economy as a 
whole. 
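The logical point, that retardation in every industry does not by itself imply retardation in the total, can be checked with a small numerical sketch. The model is hypothetical and is not taken from Burns: each industry is assumed to follow a logistic curve, a new industry is assumed to be founded every ten years, and each new industry's ceiling is assumed to be 3 per cent per year larger than it would have been a year earlier.

```python
import math

GROWTH = 1.03        # assumed yearly growth factor of each new industry's ceiling
SPACING = 10         # assumed years between the founding of successive industries


def industry_output(t, founded):
    """Logistic output of one industry; its own rate of growth falls steadily with age."""
    ceiling = GROWTH ** founded
    return ceiling / (1.0 + math.exp(-(t - founded - 20.0) / 4.0))


def total_output(t, first=-400, last=400):
    """Sum over industries founded every SPACING years between `first` and `last`."""
    return sum(industry_output(t, f) for f in range(first, last + 1, SPACING))


def growth_ratio(series, t, step=SPACING):
    """Ratio of a series at t + step to its value at t."""
    return series(t + step) / series(t)
```

Under these assumptions the ten-year growth ratio of any single industry declines toward 1, while the ratio for the total stays close to `GROWTH ** SPACING` decade after decade, because new industries keep replacing the slowing ones.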


REVIEW BY MAX SASULY
Research Associate, The Robinson Foundation 
Washington, D. C. 


This is an important, certainly a significant book. We may discount the
jacket statements that it

establishes the ground for a completely new approach to the economic
problems of our era. It puts a tool in the hands of businessmen, statesmen,
and serious readers which will enable them to predict the economic outlook
for the critical years ahead.


This is over-optimistic, even for a jacket blurb. But the book is significant 
as a contemporary sociological document, emphasizing the difficulties of
developing a realistic economics in the service of society.

The authors, according to the jacket, are experienced in counseling
“corporate enterprise.” The book is in fact a model of skillful writing. A good 
deal of information is given about “cycles,” but not more than confused 
intimations about the science of prediction. The content of the book boils 
down to the quest for a science of prediction—by business counselors. The 
title and the style are, however, very effective. Backed by skillful publicity 
in the preceding years, the book has been a best-seller for months! The ob- 
jective of the reported research is not abstract theory; it is the age-old quest 
of “Forecasting for Profit.” The book may contribute to this aim somewhat,
indirectly. Indirectly it may contribute to the progress in cycle research
by high-lighting the gulf between the demand for realistic economic
analysis and the generally barren theory and disputation available as supply. 

The book appears to be a progress report of “a non-profit corporation” 
organized as the Foundation for the Study of Cycles, of which Mr. Dewey is 
director. The early chapters treat of trends and patterns in typical industries.
Well-known incidents of recent decades in commerce and industry are pre-
sented in the jargon of the financial and commodity marts with disarming 
easy simplicity. The next few chapters deal with rhythmic cycles. Certain 
cycles of nature, some more or less recently noticed, are first discussed. 
Then a chapter each is given to certain marked economic cycles that have
become familiar in recent literature: the cycles of 54, 9, 3½, and 18 years.
The final chapters treat of purported causal and correlation analyses of eco- 
nomic series; the effects of the War; and probable postwar trends and 
rhythms. In three short appendices are presented—as tools for the reader (?) 
—the rudiments of the “ratio” scale, moving averages, and the “section 
moving average.” No index is supplied. 

Very effective advance publicity was given the book in a popular article¹
in 1943. This led Smith and Duncan² to the belief, in 1944, that:

So general was the interest in cyclical behavior [after the 1930’s] that
by 1940 the Foundation for the Study of Cycles ... proposed to help in
the task of integrating the work of the thousands of scientists and
statisticians who are contributing in various fields to the study of cycles.
Not only have cycles been found to exist in the realm of business activity,
but scientists in many other fields believe they have discovered cyclical
behavior in their respective studies.


Dewey stated, in his 1943 article, “the results are most encouraging. Fore- 
casts made by projecting well-established cycles are coming true every 
day.” He predicted: 


Figure on the possibility of a major financial and economic setback start- 
ing about 1947. ... Heavy industry will begin to slack off next year [1944] 
if the 41-month rhythm of peaks and valleys prevails over this war as it did 
during the last one. 


1 Dewey, Edward R. “Science Predicts the Future.” Am Mag 136:90-2 S ’43.

2 Smith, James G., and Duncan, Acheson J. Elementary Statistics and Applications: Vol. I,
Fundamentals of the Theory of Statistics, p. 652. New York: McGraw-Hill Book Co., Inc., 1944.

In the present book, in 1947, Dewey and Dakin purport to do more than
“integrate” the work of numerous others. They believe that [like nuclear
physics]


economics is now reaching a point where it can hope also to make rather 
accurate predictions, within limits which this study will explain (p. vii). 

The reader will be introduced to a method of thinking about the future 
which—new though it may be to him—seems definitely to have proved of 
value. It is this method which is of fundamental importance—an im- 
portance greater than any specific conclusions to which it may lead. For on 
its validity depends the whole value of the conclusions (p. viii). 


The authors then record their deep indebtedness 


to those whose names, equations and graphs line the pages of this book— 
and to many others unnamed—.... Theirs is the pioneering that is moving 
economics out of the blind alley where it stood for many years, so that it can 
take its rank as a true science (p. viii). 


The substance of the book is in fact largely derived from available familiar 
studies on cycles: Persons, Mitchell, Schumpeter, and especially W. I. King 
on economic cycles; Harlan Stetson, and others on sunspot and other cycles 
in nature. The authors, however, read much more of an inexorable periodic- 
ity into the work of these well-known scholars than is warranted. The dis- 
turbing implication of this determinism is explained away (p. 98) by reason- 
ing that so far has proved inscrutable to the reviewer. Indeed, elsewhere 
the authors aver blithely that social forces—at least quasi-volitional—can 
thwart, or reinforce the normal pattern of the unfolding cycle (p. 113)—this 
with reference to comments on the “inflationary forces” influencing us at 
present (1947). Some lingering doubts as to the consistency of their position 
are disposed of by a singular argument: one must not “reason solely on the 
ordinary cause-and-effect basis” in business, or in general economic analyses
(p. 114). The authors purport to derive support for this dialectic handspring 
from some of Eddington’s lay reflections (in his Gifford lectures, 1925), and 
from nebulous references to Ouspensky, Spengler and Jeans. 

The principal defect of the authors’ discussion is a misreading of their 
primary sources, and misinterpretation of others. They also omit reference 
to outstanding works that quite invalidate their position. Thus, Harlan 
Stetson, a veteran physicist astronomer, discusses sunspots and their
influence in a number of studies extending over twenty years, and clearly shows³
that the 11-year cycle, so striking in appearance on the graph, varies widely 
in length—10 to 16 years—and has no more scientific validity than the 
periods of 23, 37, 77, 83, and more years found by other competent in- 
vestigators. Mitchell and his coworkers have studied hundreds of economic 
series during the past three decades. In the last annual report⁴ of the
National Bureau of Economic Research, Burns demonstrates again that
recurrences may always be expected, but, so far, they cannot be predicted. The
repeated failures during the past century of hypotheses suggesting periods
from forty months to sixty years in economic series are stressed anew by
Burns (pp. 3-5). These cannot be blandly explained away by the Dewey-Dakin
Chapter 13, on “Avoiding Some Economic Illusions.” It is impressive,
but neither relevant nor conclusive, when they quote in their argument
this sententious aphorism from Spengler:

The separate sciences (epistemology, physics, chemistry, mathematics,
astronomy) are approaching one another with acceleration, converging
toward a complete identity of results.

Their case would not be bettered if, as the authors state, “he could well have
added biology, psychology, economics, and sociology to his list” (p. 198).

3 Stetson, Harlan. Sunspots and Their Effects, pp. 162-72. New York: McGraw-Hill Book Co.,
1930.

4 Burns, Arthur F. Stepping Stones Toward the Future. Twenty-Seventh Annual Report of the
National Bureau of Economic Research. New York: the Bureau, March, 1947.

From the technical statistical analysis side, the thesis of the authors is 
thoroughly repudiated in the extended recent study (not noted by them) by 
M. G. Kendall.⁵ Kendall’s authoritative analysis is supported by the panel
of eminent English statisticians discussing his paper. 

The discussion in the book ignores the well-considered, authoritative 
articles in the Encyclopedia of the Social Sciences (“Business Cycles” by 
Wesley C. Mitchell and “Forecasting Business” by Garfield V. Cox) and 
in the Encyclopedia Britannica, Fourteenth Edition (“Trade Cycle” by
D. H. Robertson and “Trade Forecasts” by W. I. King). These authors 
rejected without qualification the idea that economic cycles can be predicted. 
Robertson, noting that for England crises occurred in 1835, 1847, 1866, 1873, 
1882, 1890, 1900, 1907, 1913 and 1920, stressed that it is unsafe to dogmatize 
about the length or degree of phase synchronism in the leading industries or 
in leading industrial countries—for any cycle phase. W. I. King, on whose 
The Causes of Economic Fluctuations (1938) Dewey and Dakin lean heavily, 
noted clearly with regard to patterns that frequently repeat themselves, that 
the difficulty is “that in many cases expected repetition fails to materialize.” 
The authors also make extensive use of Edgar Lawrence Smith’s study, 
Tides in the Affairs of Men (1939). They give no inkling that he subtitled his 
book as merely “An Approach to the Appraisal of Economic Change.” Finally,
it is singular that the authors make no reference to Burns and Mitchell’s
significant book, Measuring Business Cycles,⁶ which was available early in
1946. Preliminary versions of its content have been available for over ten
years, e.g., Arthur F. Burns, in a 1936 paper,⁷ reported that Mitchell’s
“forthcoming second volume on business cycles” exhibits the markedly
wide range of variation in length of various economic cycles.

The wide variation in cycle length is the nub of the difficulty in the quest
for means of predicting cycle phases. This difficulty has completely escaped
the authors’ notice. They draw heavily on Schumpeter’s notable Business
Cycles.⁸ Schumpeter gives considerable attention to the variability of cycle
phases; his last chapter is “The Disappointing Juglar” (the 9-year cycle).
Schumpeter’s notable idea of innovations, which the authors confuse badly
with Slutsky’s “random shocks” (p. 145), is of crucial significance here. The
authors would find some invaluable suggestions for a pragmatic interpretation
of the dynamics of cycles, far beyond the ken of the linguistic labors of
Spengler and Sorokin, in Schumpeter’s idea of the role of innovations in
generating cycles.⁹ If the authors would recognize that innovations are like
“driving forces” in vibrating systems, or like perturbing forces in periodic
orbits, they could serve more effectively the objectives presumably
contemplated by the Foundation for the Study of Cycles.

5 Kendall, M. G. “On the Analysis of Oscillatory Time-Series.” J Royal Stat Soc 108:93-141 ’45.

6 Burns, Arthur F., and Mitchell, Wesley C. Measuring Business Cycles. New York: National
Bureau of Economic Research, 1946.

7 Burns, Arthur F. “The Brookings Inquiry Into Income Distribution and Progress,” p. 513.
Q J Econ 50:476-523 My ’36.

The entire technical apparatus used in this ambitious study is apparently
revealed in a footnote on page 174. It seems to comprise merely graphical
juggling of deviations from rudimentary moving average “trends,” direct
and rate-of-change. There are ghostly echoes of the “eliminated,” “crossed,”
and “lagged” trends of the “Quadrature Theory” hoax of the middle 1920’s
(this JOURNAL, March 1924). Naturally the clues to the mechanics of cycle
generation have eluded the authors. Their avid readers might, perhaps, find
some light in their quest for a science of prediction in the better treatises on
vibrating systems, e.g., those of Crandall (1926), or Den Hartog (1940).
Regrettably, these readers do not appear to have found such light in
available recent studies on business cycles.

8 Schumpeter, J. A. Business Cycles, Vols. 1 and 2. New York: McGraw-Hill Book Co., 1939.

9 Cf. Goodwin, Richard. “Innovations and Irregularity of Economic Cycles.” R Econ Stat 28:
95-104 My ’46.


Probit Analysis: A Statistical Treatment of the Sigmoid Response Curve. D. J.
Finney (Lecturer in the Design and Analysis of Scientific Experiment, University
of Oxford.) Foreword by F. Tattersfield (Head of the Department of Insecticides
and Fungicides, Rothamsted Experimental Station, Harpenden, Herts, England).
London N.W. 1: Cambridge University Press (Bentley House, 200 Euston Road),
1947. Pp. xiii, 256. 18s. (New York 11: Macmillan Co. [60 Fifth Ave.]. $3.75.)
Two reviews follow:


REVIEW BY MARGARET MERRELL
Associate Professor of Biostatistics
School of Hygiene and Public Health, The Johns Hopkins University
Baltimore, Maryland


The word order of the title of this book rather carries the implication that
probit analysis is a method tied to problems of biological response.
Although the title may not be important, it is mentioned here because it
reflects the presentation of the material. The method of analysis is almost
completely identified with the bioassay problem to which it is applied, and
vice versa. Thus the book is at the same time a textbook on the problems
involved in assaying products through biological responses, and a handbook
on the solution to these problems by means of probit analysis. In order to
appraise the book, these two aspects will be considered separately.

The book opens with a definition and discussion of the term biological as- 
say, as “the measurement of the potency of any stimulus... by means of 
the reactions which it produces in living matter” (p. 1). The type of assay 
most fully treated in the book is that of the all-or-nothing response, such as 
lived or died. With graded doses of the stimulus, changing proportions of the 
test animals will respond and a sigmoid response curve usually results. Around 
this curve, numerous biological and analytical questions arise, concerned 
with both the variation in the product being tested and the variation in the 
biological form used for test. Finney gives an excellent discussion of many of 
these problems. Some of these will be mentioned to give some impression of 
the scope of the book: the determination of relative potency of two products, 
and a consideration of situations in which the term “relative potency” has 
no real meaning; adjustment for natural mortality which occurs independ- 
ently of the product tested; the factorial design of experiments to avoid the 
weakness of considering one at a time, a series of factors which may not 
operate independently; the effect of mixtures of poisons which may be inde- 
pendent and different in their action, similar in their action, or synergistic; 
the case of a graded rather than an all-or-nothing response. This is a
considerably more comprehensive discussion of the problems than is available
in other books on bioassay, and will serve a very useful purpose in bringing 
together material that is scattered rather widely through the literature. 

The solution to these problems is oriented to the fitting of the cumulative 
normal curve by means of the probit transformation. The term “probit” 
(contraction of “probability unit”) was introduced by C. I. Bliss in 1934, 
for the deviate in a normal distribution with mean 5 and variance unity. His 
reason for assigning 5 rather than 0 to the mean was a desire to avoid nega- 
tive numbers. 
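The definition just given can be written out directly. The sketch below uses the Python standard library's normal distribution; the function names are illustrative and are not Bliss's or Finney's notation.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def probit(p):
    """Probit of a proportion p (0 < p < 1): the normal deviate plus 5."""
    return 5.0 + _STANDARD_NORMAL.inv_cdf(p)


def proportion_from_probit(y):
    """Inverse transformation: expected proportion responding for probit y."""
    return _STANDARD_NORMAL.cdf(y - 5.0)
```

A response of 50 per cent corresponds to a probit of 5, and proportions within the range usually met in assay work give positive probits, which is the convenience the shift by 5 was meant to secure.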

Finney points out in an historical sketch (pp. 41-47) that the method is 
much older than the name. He traces it back to 1860 and points out that it 
has since then been developed anew around problems in a number of differ- 
ent fields. His presentation of the method is, however, so interwoven with 
the problem to which he applies it that it may be well to state the funda- 
mental steps independent of that problem. These are: (a) determining 
whether a normal curve is suitable for describing the data, and if not,
transforming the scale of measurement so that the distribution is normal;
(b) substituting for the accumulated proportions the corresponding probits, which
will then be linearly related to the variate; (c) determining the parameters 
of the straight line relating the probits to the variate; and (d) performing 
goodness of fit tests, and setting up confidence limits about the parameters 
and the line. 
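Steps (a) to (d) can be sketched on a small example. Everything specific here is an assumption made for illustration: the assay figures are hypothetical, the logarithm of dose is taken as the scale on which the response is normal, and an unweighted least-squares line stands in for the line fitted by eye. Confidence limits are not attempted.

```python
import math
from statistics import NormalDist

NORMAL = NormalDist()

# Hypothetical assay: (dose, number of subjects tested, number responding).
ASSAY = [(1.0, 40, 4), (2.0, 40, 11), (4.0, 40, 22), (8.0, 40, 31), (16.0, 40, 37)]


def fit_probit_line(assay):
    """Steps (a)-(c): log-dose scale, empirical probits, unweighted straight line."""
    xs = [math.log10(dose) for dose, _, _ in assay]             # (a) transformed scale
    ys = [5.0 + NORMAL.inv_cdf(r / n) for _, n, r in assay]     # (b) empirical probits
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))               # (c) slope of the line
    intercept = y_bar - slope * x_bar
    return intercept, slope


def median_effective_dose(intercept, slope):
    """Dose at which the fitted probit equals 5, i.e. 50 per cent respond."""
    return 10.0 ** ((5.0 - intercept) / slope)


def chi_square(assay, intercept, slope):
    """Step (d): Pearson chi-square of observed against fitted numbers responding."""
    total = 0.0
    for dose, n, r in assay:
        p_hat = NORMAL.cdf(intercept + slope * math.log10(dose) - 5.0)
        total += (r - n * p_hat) ** 2 / (n * p_hat * (1.0 - p_hat))
    return total
```

Calling `fit_probit_line(ASSAY)` and passing the result to `median_effective_dose` gives the estimated dose for a 50 per cent response; empirical probits cannot be formed for groups in which none or all respond, which is one reason the weighted iterative method exists.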

In recent years, J. H. Gaddum, C. I. Bliss, R. A. Fisher, D. J. Finney, and
others have contributed extensively to refinements in the methods of
estimation, determination of the sampling variations of the constants, and
plans of experiments related to this method of analysis.

The book is designed to explain these developments to people dealing 
with the problem of bioassay. It “is written with the intention of introducing 
the probit method to many who have previously not ventured to use it, and 
of presenting some of its more recent developments to those who are already 
familiar with it” (p. 6). It assumes some but not an extensive knowledge of 
statistics, and is directed primarily to the applied person. The procedures 
are explained around arithmetic examples which will be especially useful in 
acquainting the biologist with the methods. Appendices present a systematic 
treatment of one example, with all steps clearly described, a mathematical 
outline of the theory underlying the method, and a series of very useful tables 
which facilitate the computation. 

Probit analysis is first presented in terms of a graphical method of de- 
termining the parameters of the straight line, and it is stated that this 
method is often adequate and serves especially the person who has a long 
series of estimations to make. The statement (p. 24) that “many experi- 
menters who make use of probit analysis spend time unnecessarily on arith- 
metic when eye estimation would suffice” is unquestionably true, and it is 
gratifying to have this point emphasized. 

The more exact maximum likelihood estimates obtained by a series of 
successive approximations are presented in Chapter 4. The reasons stated 
for using more exact methods are that the observations may be too irregular 
for any confidence to be placed in an eye estimation of the fitted line, or 
differences in weights of the observations may be hard to allow for, or the 
experiment may involve several different factors which are under test which 
call for more involved analytical techniques. The last reason is a sound one 
for using the more involved techniques in certain problems, and the second 
may occasionally be a justification, but the first is dubious. In general it is 
the very good observations, not the crude ones, that justify elaboration of 
the analysis. The fact is, that for an enormous number of problems in bio- 
assay, a graphical fitting of the straight line gives adequate estimates; for 
if the observations are very smooth, the graphical estimates agree well 
with the maximum likelihood estimates, and if the observations are irregular, 
the sampling errors are so great that graphical estimations are within 
sampling range of the more exact estimations, even if they show a sizable 
arithmetic difference from them. This is true in the examples given by 
Finney. 

The question of graphical versus more exact estimates is intimately con- 
nected with the design of experiments, for it involves a balance of gains 
through time spent on refinements of estimates from a given experiment 
versus additional experimentation, at various stages of an investigation. As 
stated earlier, Finney has considerable to say on design of experiments in 
this field, yet he gives no guidance on this point, saying that “Only experience 
of the subject and of the experimental technique used can be a sound guide
in this matter” (p. 24). This is an issue on which the statistician is frequently
consulted, and rightly so, by the experimenter; and the discussion of design 
of experiment should incorporate consideration of this point. 

Methods of analysis of the dosage response curve through other equations 
than the normal curve are mentioned but not seriously considered. It is 
stated (p. 47) that none has been found to compare in usefulness with the 
probit transformation. It is true that none has been as fully explored, and in 
this sense Finney’s statement may be accepted. There is perhaps no reason to 
expect a book on probit analysis to deal critically with other methods. It is 
to be hoped, however, that the virtual identification of the problem of bio- 
assay with a particular statistical treatment will not lead the applied man 
to the view that this is the “correct” analysis to the problem. Referring again 
to the title, it should be emphasized that this analysis is “a statistical treat- 
ment” of the problem, and not “the statistical treatment.” Since it is a widely 
used treatment at the present time, it is very useful to have this comprehen- 
sive summary of the procedures made available, and at the same time, a 
thorough discussion of the problems in one field where they are applied. 


REVIEW BY JOSEPH BERKSON
Division of Biometry and Medical Statistics 
Mayo Clinic, Rochester, Minnesota 


This is a very exceptional if not a unique book. I doubt whether ever before
an entire volume has been devoted to the statistical treatment of one
particular curve. Confining the specialization still more, only one aspect of 
the function is considered, for it is not with the normal curve as such that 
this book deals, but only with its integrated form. Singleness of purpose is 
further exhibited by the technic of analysis dealt with, for the author gives 
attention, not to the multitudinous sorts of mathematical methods that con- 
ceivably can be applied to the curve, but to one particular type, which is 
called “probit analysis.” Within the probit scheme he largely confines himself 
to the statistical methods advanced by R. A. Fisher and his students. And 
finally, the data considered are only those of biologic assay. 

Such a concentration of topic and method affords an opportunity for thor- 
ough and careful presentation, and this opportunity the author has embraced 
and exploited effectively. The book as a whole is well planned pedagogically; 
the simpler elements are expertly separated out for initial presentation, and 
step by step the more intricate aspects are dealt with in successive chapters. 
The chapter headings may serve to convey this progress concretely: (1) In- 
troductory; (2) Quantal responses and the dosage-response curve; (3) The 
estimation of the median effective dose; (4) The maximum likelihood solu- 
tion; (5) The comparison of effectiveness; (6) Adjustments for natural mor- 
tality; (7) Factorial experiments; (8) The toxic action of mixtures of poison; 
(9) Miscellaneous problems; (10) Graded responses. In Appendix I is given a 
detailed description, in terms of an example, of the computational
procedures; and in Appendix II an account of the mathematical basis of the probit
method. There are ample tables, a good list of references, and a good index.
I noticed only a few misprints, such as the reference to equation 4.1 on page
55, which apparently refers to an unnumbered equation on page 53; and the
year 1940 instead of 1941 given for an article by Garwood. In the list of
references the absence of the article by Galton,¹ which presented Sheppard’s
calculated table of normal deviates, is conspicuous because it is from this that
the probit tables used today are in all likelihood derived.

The basis of the method is the linear transform of the integral of the
normal curve, the value of the latter up to X being considered to give the
percentage response. The probit Y is obtained by adding 5 to X, in order to
avoid negative values in the use of the transform. The transformation makes
it possible to represent the relation of the percentage response to the dose
by a straight line and to pose the problem of fitting as essentially one of
linear regression. The use of linear transformation for curve fitting itself is
not novel and is familiar to everyone, at least in the form of fitting an
exponential curve by plotting the logarithm. The method was used even for
the integrated normal curve long before the initiation of “probits.” However,
not until Fisher and Bliss devised an appropriate system of weights
and “working values,” especially the last, was any attempt made to carry
over exactly into the transformed scale, the computations that necessarily
are defined in terms of the original values.²
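The system of weights and working values can be sketched as an iterated weighted regression. The formulas used are the ones commonly stated for the method and are not quoted from the review: with provisional probit Y, expected proportion P, Q = 1 - P, and normal ordinate Z, the working probit is Y + (p - P)/Z and the weight is nZ²/(PQ). The function names and the fixed number of cycles are assumptions of the example.

```python
from statistics import NormalDist

NORMAL = NormalDist()


def weighted_line(xs, ys, ws):
    """Weighted least-squares straight line y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    b = (sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
         / sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs)))
    return y_bar - b * x_bar, b


def probit_iteration(xs, ns, ps, a, b):
    """One cycle: from the provisional line (a, b) to an improved line."""
    working, weights = [], []
    for x, n, p in zip(xs, ns, ps):
        expected = a + b * x                      # provisional probit Y
        big_p = NORMAL.cdf(expected - 5.0)        # expected proportion P
        z = NORMAL.pdf(expected - 5.0)            # ordinate Z of the normal curve
        working.append(expected + (p - big_p) / z)            # working probit
        weights.append(n * z * z / (big_p * (1.0 - big_p)))   # n times Z^2 / (P Q)
    return weighted_line(xs, working, weights)


def fit_probit(xs, ns, ps, a, b, cycles=25):
    """Repeat the cycle a fixed number of times, starting from a provisional line."""
    for _ in range(cycles):
        a, b = probit_iteration(xs, ns, ps, a, b)
    return a, b
```

Here `xs` are the dosage values on the transformed scale, `ns` the numbers tested, and `ps` the observed proportions responding; the starting line may come from a fit by eye. In practice one would stop when successive lines agree rather than after a fixed count.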

With a precise linearization thus accomplished, the way was cleared for 
the analysis of biologic assay, the responses of the latter being considered 
to be represented by the integrated normal curve, by applying to it all the 
statistical technics that had been worked out for straight lines. This included 
in its ramifications almost the entire gamut of methods elaborated by R. A. 
Fisher, as the list of chapter headings given previously suggests. Since the 
chapters are comprehensive and clearly written, the volume serves as a good 
text for the use of these methods themselves, apart from their application to 
bio-assay. 

If one considers it timely to offer a didactic text on probit analysis, as here 
delimited, to represent bio-assay, then this book is a job well done. However, 

1 Galton, Francis. “Grades and Deviates: Including a Table of Normal Deviates Corresponding to 
Each Millesimal Grade in the Length of an Array, and a Figure.” Biometrika 5:400-6 Je "07. 

2 The consideration of weights is related to what I believe is an error in the text. On pages 53-54 we 
read: “This quantity is Snw(y —5)?reduced by [Snw(x —2) (y —¥) ]2/Snw(z —2)?; the calculation is shown 
in Table 6, and x? =1.67, which differs from the x? of Table 7 only on account of rounding-off errors.” 


The value of Table 7 referred to, 1.62, is the actual X? of the observations in relation to the fitted curve 

(o —t)? a ‘ P ‘ ‘ 
defined a, In the probit iteration, at each step, we use, for the regression being determined, 
approximate weights and working values derived from the previously determined regression. It is the 
sum of the squared residuals of the working probits around the fitted line thus falsely weighted that is 
given by the expression referred to. The x? directly calculated refers to the actual observations and is 
equal to the sum of their squared residuals with the weights of the present solution. These two sums will 
not be equal, no matter how many decimal places are cerried, until finally when the maximum likelihood 
solution is obtained. In fact a signal, and a useful one, as to whether this point has been reached is just 
this comparison. 
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I myself doubt the appropriateness of such a volume. That the probit is a 
handy and generally good-enough tool for dealing statistically with the data 
of biologic assay need not be doubted. If easily applied methods for fitting 
are utilized and if the associated formulas are considered as suggestive only 
and are used with ample safety factors, no harm, and very probably much 
good, can come of its use. But the impression often conveyed by its devotees 
that this particular transform is a thing unique, possessing some intrinsic 
superiority, and that the methods used are mathematically “exact” and
therefore unexceptionable, is open to question.

Out of some 200 pages of text, the author devotes only about one to curves 
other than the integrated normal curve and says: “None of these relationships 
has yet been found to compare in usefulness with the probit transformation.” 
It would be more accurate to say that no comprehensive examination has 
been attempted of the comparative merits of different transforms and that 
such comparisons as have been made have not been examined adequately 
if at all by the author of the present text. Some studies indicate that there 
are certain advantages in other transforms. The angular transformation, 
for instance, has been found simpler to fit³ and the logistic not only easier to
fit but better in accord with bio-assay data.⁴ Before a particular curve is
adopted as the statistical norm, do we not need to know whether it repre- 
sents the actual physicochemical mechanism of the bio-assay reaction; that 
is, do we not need a study of the differential equations representing the theo- 
retical mechanisms involved? As a minimal responsibility it would seem that 
the author of so definitive a text should himself have ascertained which of 
several proposed curves fits the data best, before committing a public that is 
gullible with regard to any statement printed in a textbook to the curve that 
happened to have emphasis in his statistical schooling. In fairness, however,
it should be said that the methods worked out for the probit transform can, 
with appropriate modification, often be applied to other linear transforms, 
should it be desired. A statement to that effect would, I believe, strengthen 
the present work. 

The sectarianism shown by the author in regard to the choice of curve is 
exhibited likewise in respect of the statistical methods advanced for apply- 
ing it. The method advocated for fitting the curve is that of maximum likeli- 
hood, as proposed by R. A. Fisher, rather than least squares, on which most 
of us oldsters were brought up. For the situation of biologic assay as dealt 
with here these two methods do not give the same estimates, as the author 
clearly recognizes (p. 54).⁵ An example for which I have made some necessary

3 Knudsen, Lila F., and Curtis, Jack M. “The Use of the Angular Transformation in Biological
Assay.” J Am Stat Assn 42:282-96 Je '47.

4 Berkson, Joseph. “Application of the Logistic Function to Bio-Assay.” J Am Stat Assn 39:357-65.

5 This apparently has not always been clear to writers on the subject. Thus we find Irwin and
Cheeseman (Sup J Royal Stat Soc 6:174-85 '39) saying in an article which is often referred to: “...
as is to be expected from the efficiency of the maximum likelihood solution, which in this case is the
ordinary least square fit.”

calculations is one used by Fisher and Yates⁶ and for which Garwood⁷
calculated the maximum likelihood solution by the probit method. Four suc- 
cessive approximations were required before constancy of the estimates of 
the parameters was achieved. For the first provisional solution the estimates 
of a and b were respectively 4.600 and 0.600; for the fourth solution they were
4.566 and 0.713, showing an appreciable difference between the first and last
estimates. Now the likelihood was highest for the last solution, but the suc-
cessive values of χ² beginning with the first provisional solution were 3.54,
3.67, 3.77 and 3.79. The χ² test is often used as a measure of goodness of fit,
and its use is in fact a part of the prescribed procedure of probit analysis.
By this criterion, in the successive iterations in this example we were la-
boriously moving toward a worse and worse solution! Which solution is the
“best”: the first, of which the value of χ² was smallest; or the last, for which
the likelihood was largest? Which of the two estimates has the smaller vari-
ance?⁸ And does not the use of the χ² test require that no part of any possible
excess of χ² found be due to the method of estimate used?
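
The tension between the two criteria can be illustrated numerically. The following is a minimal sketch only: the dose-response figures are hypothetical (they are not the Fisher and Yates example), the function names are invented for illustration, and a coarse grid search stands in for the iterative probit arithmetic. It shows that, for a probit line Y = a + bx, the pair (a, b) that maximizes the binomial likelihood need not be the pair that minimizes χ².

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()

def expected_rate(a, b, x):
    """Response probability for the probit line Y = a + b*x (probit = deviate + 5)."""
    return STD_NORMAL.cdf(a + b * x - 5.0)

def pearson_chi2(a, b, doses, n, r):
    """Sum of (observed - expected)^2 / expected over responders and non-responders."""
    total = 0.0
    for x, ni, ri in zip(doses, n, r):
        p = expected_rate(a, b, x)
        total += (ri - ni * p) ** 2 / (ni * p * (1.0 - p))
    return total

def log_likelihood(a, b, doses, n, r):
    """Binomial log likelihood, omitting the constant binomial coefficients."""
    total = 0.0
    for x, ni, ri in zip(doses, n, r):
        p = expected_rate(a, b, x)
        total += ri * log(p) + (ni - ri) * log(1.0 - p)
    return total

# Hypothetical data: log dose, number exposed, number responding.
doses = [0.0, 1.0, 2.0, 3.0, 4.0]
n = [20, 20, 20, 20, 20]
r = [2, 6, 11, 15, 19]

# Coarse grid over (a, b); an assumption made for brevity, not the probit iteration.
grid = [(3.0 + 0.02 * i, 0.2 + 0.01 * j) for i in range(101) for j in range(101)]
ml_fit = max(grid, key=lambda ab: log_likelihood(ab[0], ab[1], doses, n, r))
mcs_fit = min(grid, key=lambda ab: pearson_chi2(ab[0], ab[1], doses, n, r))

print("maximum likelihood fit:", ml_fit, pearson_chi2(*ml_fit, doses, n, r))
print("minimum chi-square fit:", mcs_fit, pearson_chi2(*mcs_fit, doses, n, r))
```

By construction the χ² printed for the minimum χ² fit cannot exceed that printed for the maximum likelihood fit; which of the two estimates is preferable is precisely the question raised above, and the sketch does not settle it.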

These are not impertinent questions; indeed they must be answered before 
we can say that the basic procedure used in probit analysis rests on a logical 
foundation. For the difference between graphic fits by eye and the exact 
maximum likelihood solution is of the same order as the difference between 
minimum χ² and maximum likelihood estimates. If we are to proceed from
fitting by eye, using a rather laborious arithmetic procedure, toward a more
refined solution, it can only be because we are assuredly moving toward a
better solution. But if, as is possible, the minimum χ² solution is the better,
then in seeking a maximum likelihood solution we are, as would be illustrated 
in the case of Garwood, actually achieving a worse estimate than the one 
with which we started. 

My own questions regarding the probit method for fitting are for the most 
part on an intuitive and “common sense” basis and are little assisted or em- 
barrassed by knowledge of the higher mathematics. However, misgivings evi- 
dently have been felt by the mathematically mature and competent. Thus, 
speaking of the probit solution, Thompson⁹ said: “A general theorem of convergence


6 Fisher, Ronald A., and Yates, Frank. Statistical Tables for Biological, Agricultural, and Medical
Research, Second Edition. Edinburgh, Scotland: Oliver & Boyd, Ltd., 1943.

7 Garwood, F. “The Application of Maximum Likelihood to Dosage-Mortality Curves.” Biometrika
32:46-58 Ja ’41.

8 To many the question of which has the smaller variance will appear to be authoritatively answered
by Finney in the following quotation (p. 209): “Fisher (1922, 1925) has shown that estimates of the
parameters which maximize the likelihood are efficient in the sense of having minimal sampling variance
in large samples.” Since for the case in discussion the maximum likelihood and minimum χ² estimates are
different, it would appear that either both their variances are equal or the variance of one is smaller
and that the one given by maximum likelihood. Inquiry discloses, however, that such simple interpre-
tation reveals a lack of insight into the delphic implications of English when spoken by mathematicians.
For it appears that minimum χ² estimates are also “efficient” estimates and apparently it is not in con-
tradiction with the quoted dictum that they may give for samples of say 1,000, estimates with smaller
variance than those of maximum likelihood.

9 Thompson, William R. “Use of Moving Averages and Interpolation to Estimate Median-Effective
Dose.” Bacteriol R 11:115-45 Je '47.

has not been proved for any set of points (log Dᵢ, pᵢ), or with certain
specific exceptions, where we are required to fit a given curve or a
straight line in transformed coordinates in the indicated manner. Indeed, the 
general theorem without such exceptions can easily be disproved . . . and ref- 
erence to it as ‘the exact’ method . . . may lead to a false appreciation.” 

Similar doubts will assail one on careful examination of many other phases 
of the probit analysis. On the basis of practical experience I have always felt 
instinctively that the use of observed values of zero or 100 per cent when 
these are asymptotic values is hazardous in curve fitting procedures. Others 
apparently hold similar views; for instance, Knudsen and Curtis in advocating
the arc-sine transform advise against the use of any observations outside
the range of 0.05-0.95. Cannot the basis of this hesitancy to use the absolute
rates be found in mathematical considerations? In assuming that the
integrated normal curve represents the true quantal responses to increased 
dosage, it is implied that for very large samples, a rate of zero or 100 per cent 
will not occur except with infinite doses. This is of course in marked contra- 
diction with actual observations, in which such rates are always encountered 
with finite and even moderate doses. In his basic article on “The case of zero 
survivors” Fisher¹⁰ does not evidently make any allowance for this difference
between theory and fact. But it is not to be taken for granted that this dif- 
ference between the mathematical model and reality is without practical 
consequence, and that it may be disregarded without biasing the estimates. 
In view of the practical worker’s hesitation to use absolute rates, the burden 
of proof would appear to rest with those advocating their inclusion in statis- 
tical schemes. Do we not have the right to require of the mathematical stat- 
istician that he demonstrate that the consequences of likely departures from 
theory are negligible, before developments based on the theory are consid- 
ered definitely established? 
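
The mathematical side of this hesitancy is easy to exhibit. A minimal sketch, assuming the usual definition of the probit as the normal equivalent deviate plus 5 (the helper name `probit` is illustrative): observed rates of exactly zero or 100 per cent have no finite probit, so they cannot be plotted or weighted like the other points without some special device.

```python
from statistics import NormalDist, StatisticsError

def probit(p):
    """Empirical probit of an observed rate p; None where no finite value exists."""
    try:
        return NormalDist().inv_cdf(p) + 5.0
    except StatisticsError:
        # inv_cdf is defined only for 0 < p < 1; rates of 0 and 1
        # correspond to deviates of minus and plus infinity.
        return None

for rate in (0.0, 0.05, 0.5, 0.95, 1.0):
    print(rate, probit(rate))
```

The rates 0.05 and 0.95, the limits mentioned by Knudsen and Curtis for their transform, give finite values; the absolute rates give none.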

The fallacious practice of applying a test of significance by first offering lip 
service to the doctrine that a nonsignificant P (not low, that is) does not es- 
tablish the hypothesis tested but only fails to disprove it, and then acting 
as if it did establish it, a fallacy that has been warned against repeatedly by 
authorities in mathematical statistics,¹¹ is illustrated in Finney’s book in the
use of parallelism of probit lines for estimating relative potency. First parallelism
is assumed in order to make the estimates, then a χ² test is performed
for parallelism on the basis for instance of two degrees of freedom (p. 71). 
Mirabile dictu, the P turns out not significantly small. So the estimates based 
on the assumption of parallelism are accepted. Before such a test can be con- 
sidered to rest on a sound mathematical foundation we must have some 
knowledge of the power of the test relative to the alternative of nonparallel- 
ism. How large a departure from parallelism must exist, I wonder, before the 

10 Fisher, R. A. “The Case of Zero Survivors.” An appendix to “The Calculation of the Dosage-
Mortality Curve,” pp. 136-67, by C. I. Bliss. Ann Appl Biol 22:164-5 F '35.


11 See, for instance, A. Wald’s review of M. G. Kendall's The Advanced Theory of Statistics, Vol. II in
J Am Stat Assn 42:185-6 Mr '47.

χ² test, as used here on the basis of two degrees of freedom, will probably
show a significantly small P? And is it not a fact that departure from parallel- 
ism much smaller than this will render seriously inaccurate estimates of the 
parameters which have been made on the basis of an assumed parallelism? 
Perhaps Finney will have this calculation for us in the next edition of his book. 
Or perhaps he will find that parallelism cannot be tested effectively by means 
of tests of significance based on two degrees of freedom. 
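
The kind of calculation asked for can at least be outlined. The sketch below is an illustration under stated assumptions, not Finney's procedure: it supposes that, under a departure from parallelism, the test statistic follows a noncentral χ² distribution with two degrees of freedom, and it leaves open how a given departure translates into the noncentrality parameter, which depends on the doses, group sizes, and slopes. The function names are hypothetical. For even degrees of freedom the tail area has a closed form, so nothing beyond the standard library is needed.

```python
from math import exp, log

def chi2_tail_even_df(x, df):
    """Upper-tail probability of a central chi-square with an even number of df."""
    half = x / 2.0
    term, total = 1.0, 0.0
    for i in range(df // 2):
        if i > 0:
            term *= half / i        # builds (x/2)^i / i!
        total += term
    return exp(-half) * total

def power_two_df(noncentrality, alpha=0.05, terms=200):
    """Power of a 2-df chi-square test when the statistic is noncentral chi-square."""
    critical = -2.0 * log(alpha)    # exact critical value for 2 df
    mean = noncentrality / 2.0
    weight = exp(-mean)             # Poisson weight for j = 0
    power = 0.0
    for j in range(terms):
        # Noncentral chi-square as a Poisson mixture of central chi-squares.
        power += weight * chi2_tail_even_df(critical, 2 + 2 * j)
        weight *= mean / (j + 1)
    return power

for lam in (0.0, 1.0, 5.0, 10.0, 20.0):
    print(lam, round(power_two_df(lam), 3))
```

With no departure at all the power equals the significance level, as it should; small noncentrality values leave the power little above that level, which is the substance of the objection.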

Another matter inviting attention is the tacit assumption made throughout
the text that there is no error in x, the measure of dosage. Actually this
is an idealization, for to err is human, even in the direction of the abscissa. 
Do we not need a mathematical investigation to ascertain the possible ef- 
fect of such errors? So far as I know, no such inquiries have been made, but 
it is not hard to see intuitively that such errors, even if moderate in size, 
may play havoc with many of the formulations set forth as “exact” in this 
book. 

Aside from such specific questions, I sense a broad uncertainty as to 
whether the structure of assumptions as a whole underlying the formulas is 
warranted for the data as they actually are encountered. It is notable 
that this entire book contains not a single repetition of experiment to test 
whether the theoretical error formulas represent the real variations which 
occur in repeated trials. My own practical experience, admittedly small, 
indicates that the formulas underestimate the actual errors considerably 
and sometimes vastly; and this conclusion is confirmed by veteran workers 
in the field of bioassay. Sometimes this sort of objection is met with the 
rejoinder that the formulas give only the “statistical” or “sampling error”; 
but, if dosage to patients is guided by measures of error as given by these 
formulas and a patient succumbs from an overdose, it will be little consola- 
tion to reflect that it was not the sampling error that killed him. 

For the reasons cited and other similar ones we cannot be certain that 
probit analysis furnishes the exact method for bioassay nor even that, of 
the methods proposed, it necessarily is the best. At the moment about all 
we can be sure of is that it is the most elaborate. 


Manual de Estatistica. (Manual of Statistics.) Amaro D. Guerreiro. Instituto 
Nacional de Estatistica, Portugal. Lisboa: Tipografia Matemática, Lda., 1947.
Pp. 340.
REVIEW BY HUGO MUENCH
Professor of Biostatistics, School of Public Health 
Harvard University, Boston, Massachusetts 


The author states that his publication was prompted by the limitations
of the study of statistics in Portugal, and the scarcity of statistical 
bibliography in Portuguese. He is careful to define the book not as a compre- 
hensive text of statistics but rather as a supplement to courses given by 
statistics departments. 
Given a book of this type, it is probably too much to expect a critical appraisal
of the methods described since it is in the nature of a laboratory
manual or of a mnemonic aid in the performance of calculating procedures.
Even so, it is disappointing to encounter a wide range of statistical method- 
ology without presentation of rationale or caution as to conditions of legitim- 
ate use. It is even more disappointing to find, in the wealth of illustrative 
material, methods applied to conditions not fulfilling the underlying assump- 
tions under which such methods are justified. 

Normal correlation is, of course (and properly, within due limits of cau- 
tion) applied to material whose distribution is not strictly normal. From the 
pedagogic viewpoint, however, it seems to this reviewer that illustrative ma- 
terial should conform as closely as possible to prescribed conditions. It is 
therefore something of a shock to find methods of normal correlation and of 
partial and multiple correlation extensively illustrated by material which is 
inherently nonnormal. Incidentally, Pearson’s “eta” is presented as a measure 
of “curvilinear correlation.” 

To begin with, the author’s presentation of distributions seems confused. 
Concepts of meso-, lepto- and platykurtosis are illustrated by a graph of 
three distributions, all apparently normal but with differing scatter. Later 
on, relationships between quartiles and standard deviations are stated as 
valid for “symmetrical” distributions rather than for normal, and a series of 
measures of asymmetry and kurtosis is presented without discussion. 

The lack of critical presentation is most felt, perhaps, in the long chapter 
devoted to index numbers, though its absence is felt also in the two final 
chapters devoted to curve fitting. The first of these, entitled “Analysis of 
Time Series” is given to methods of fitting curves by differences (the strange 
statement is made that “if first differences are normal, the logistic should 
be used”); by least squares (parabolas to the third order); by selected points 
(logistic); and by least squares of logarithms (exponential and hyperbola). 
Smoothing by the use of moving averages is discussed; cyclic functions are 
nowhere mentioned. 

The final chapter presents the fitting of various frequency functions. Bi- 
nomial distributions are somewhat vaguely linked to the normal curve which, 
for some reason, is transformed so that its maximum ordinate is set equal to 
one, rather than its area. The Pearson system of curves is not explained, 
but Elderton’s table of these functions and their characteristics is included 
and a Type I curve is fitted at some length. 

Three appendices present tables of chi-square, of functions of the normal 
curve (ordinate =1 when z =0) and of the log Gamma function. There is an 
extensive table of contents, as well as indexes of tables and of graphs, but 
there is no alphabetic index. The table of errata fails to mention at least one 
mistake: the writing of the normal curve function on page 240. 

Mr. Guerreiro’s book should be recommended only to such students as 
have already acquired a good background in the use of statistical method 
and who therefore would be able to select, from the material here presented, 
procedures applicable to the nature and the limitations of their data. Taken
by itself it would in no way prepare a student to follow a book such, for
example, as Jorge Kingston’s A Teoria da Indução Estatística.


An Outline of Statistics, Third Edition. Samuel Hays (Formerly Lecturer in
Commercial Subjects, Doncaster Technical College, Doncaster, Yorks, England).
London S. W. 19: Longmans Green and Co. Ltd. (43 Albert Drive), 1947. Pp. vii,
254. 8s. 6d. (New York 3: Longmans Green and Co. [55 Fifth Ave.]. $2.25.) Two 
reviews follow: 


REVIEW BY HAROLD NISSELSON
Statistician, Bureau of Census, Washington, D. C. 


This is the third edition of an English primer written for “Commercial
Students in our Technical Colleges” and the “general reader of economic
literature.” To this end, it is intended to cover “the requirements of the 
National Certificate in Commerce and the various Accountancy examina- 
tions.” 

Roughly two-thirds of the text is devoted to a nontechnical discussion, 
in the traditional spirit, of descriptive methods in statistics. Of the re- 
mainder, half is given over to a consideration of published sources of statistics 
on England and the United Kingdom, and the balance to a set of routine 
exercises (with answers). The principal revisions consist of a new chapter 
dealing with rounding errors, an appendix on quality control, and the in- 
clusion of postwar sources of data. Since the book is almost pocket-size, the 
amount of space that could be devoted to any one topic is fairly limited. This 
is reflected in the progressively sketchier treatment of later topics; in par- 
ticular, time series, correlation and index numbers. However, the style is 
much more agreeable than this brevity would indicate. 

The remaining comments apply to the book as a text in statistics, rather 
than its adequacy for the examinations cited. The chapters on official and 
other published sources of data for England, and an additional one on sta- 
tistical measures (unit costs, turnover, etc.) in common business use, are the 
most valuable sections of the book and should prove very useful. By contrast, 
the technical material is far less satisfactory. In general, the approach to 
statistics is one of merely summarizing a set of data. Hence, it is considered 
a limitation of the arithmetic mean that for discrete variables (e.g., number 
of children per family) it may not “represent an actual item at all”—the 
mean being interpreted as an observation rather than as a parameter of the 
population. There is almost no mention of the role of sampling variation in 
interpreting statistical data, and the choice of the few instances where such 
reference is made is quite unexpected. Thus, a formula is given for the 
probable error of a mean or total due to rounding of the individual figures 
involved, but there is no mention of the probable error of a sample mean 
or total as an estimate of the corresponding quantity in the population 
sampled. (The probable error due to rounding as given is incorrect, and the 
statement is also made that the probable error is “the most likely value of
the error to occur.”) Again, a two-page appendix explains the probability 
that a population proportion lies in a specified range based on the proportion 
observed from a sample; and an unsatisfactory cookbook rule is given for 
judging the significance of an observed correlation. The examples cited here 
are typical of a number of misstatements and unguarded remarks on meth- 
odological material which appear at all levels of sophistication. 
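
For reference, the conventional meaning of the probable error can be stated in a few lines. The sketch is offered under explicit assumptions that are not taken from the book under review: rounding errors are treated as independent and uniformly distributed over half a unit either way, and the error of a total is treated as approximately normal. The function names are illustrative. On these assumptions the probable error is the deviation that half of all errors exceed in absolute value, about 0.6745 times the standard deviation, and not the most likely value of the error.

```python
from math import sqrt
from statistics import NormalDist

# Ratio of probable error to standard deviation for a normal variate:
# half of all errors exceed this multiple of sigma in absolute value.
PROBABLE_ERROR_FACTOR = NormalDist().inv_cdf(0.75)

def rounding_sd_of_total(n, unit=1.0):
    """SD of the error in a total of n figures, each rounded to the nearest unit."""
    # A uniform error over (-unit/2, +unit/2) has variance unit^2 / 12.
    return unit * sqrt(n / 12.0)

def probable_error_of_total(n, unit=1.0):
    """Probable error of the rounded total (normal approximation assumed)."""
    return PROBABLE_ERROR_FACTOR * rounding_sd_of_total(n, unit)

def probable_error_of_mean(n, unit=1.0):
    """Probable error of the mean of n rounded figures."""
    return probable_error_of_total(n, unit) / n

print(round(PROBABLE_ERROR_FACTOR, 4))
print(probable_error_of_total(100), probable_error_of_mean(100))
```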

For teaching purposes the unconventional terminology is awkward: for 
example, “coefficient of dispersion” for the coefficient of variation, “ratio of 
variation” for the regression coefficient, definition of the first moment of a 
set of numbers as the mean deviation and the second moment as the standard
deviation. Moreover, even for a one-semester introductory course, extensive
supplementation by lectures would be necessary.

In future editions, the usefulness of this book could be greatly enhanced 
by pruning or eliminating entirely the chapter on rounding errors, and, if 
necessary, issuing the exercises separately. The space gained in this way 
could then be devoted to developing and illustrating the basic ideas of sta- 
tistical inference, and the relations between a sample and the population on 
which it is to provide information.


REVIEW BY EDGAR Z. PALMER
Professor of Statistics and Director of Bureau of Business Research 
University of Nebraska, Lincoln, Nebraska 


The preface to this little book states that it “is intended primarily for
Commercial Students in our Technical Colleges and for students inter- 
ested in professional examinations.” It may be described as British in form 
and somewhat American in approach. The total amount of material included 
is scanty compared with most textbooks. The usual topics of collection, 
tabulation, and “diagrammatic” representation are touched upon, as well as 
averages, dispersion, index numbers, and correlation. In the chapter on 
time series most of the attention is given to the seasonal index. 

The book gives evidence of careless preparation. It is stated that a death 
rate of 12.5 means a value anywhere between 12 and 13 (p. 22); a frequency 
polygon is smoothed by the use of a weird collection of curves joining the 
tops of the bars (p. 62); the third moment measure for skewness is given in 
terms of the cubes of the deviations, without indication of summation (p. 
98); moving averages, plotted at the ends of their periods, are supposed to 
illustrate the smoothing process (p. 106); a square root sign appears without 
a figure under it (p. 205). Formulas are a crude combination of words and 
symbols, and fractions are mixed with decimals in a confusing manner. Sam- 
pling and quality control have been relegated to two appendices, the former 
topic including only the sampling error of a proportion, and the latter in- 
troducing for the first and only time the sampling error of the mean. 

It is probable that for the quality of student for which this is intended it 
is quite teachable. The writing is clear, and illustrations of computations, 
tables, and graphs are generous. The only interest for American readers is in
five short chapters on the sources of British statistics, including one chapter 
each on vital, employment, overseas trade, and price data. 


Market and Marketing Analysis. Myron S. Heidingsfield (Assistant Professor 
of Marketing, Temple University, Philadelphia 22, Pa.) and Albert B. Blankenship
(Managing Director, National Analysts, Inc., 15th and Locust Streets,
Philadelphia 2, Pa.). New York 10: Henry Holt and Co., Inc. (257 Fourth
Ave.), 1947. Pp. x, 335. $3.00. Two reviews follow:


REVIEW BY LESTER R. FRANKEL
Statistician, Marketing and Research Division 
Dun & Bradstreet, Inc., 290 Broadway, New York 8, N. Y. 


The use and application of marketing research to the solution of business
problems has received general acceptance during the past few years. Since 
many of the procedures used in this field are based upon somewhat special- 
ized statistical techniques, a warm welcome would greet a book, such as this 
one, whose purpose is to elucidate some of these methods. 

In spite of the excellent choice of topics, the orderly presentation of sub- 
jects and the simplicity of exposition, this book, as it now stands, fails to 
serve the above purpose. This results from the fact that the text contains 
an unduly high proportion of mistakes, misinterpretations, and confusing 
statements. Some of the errors can be detected by the bright student or 
astute businessman. Other statements are inconsistent with correct statisti- 
cal usage. Businessmen with some knowledge of accounting procedures will 
find some of the material confusing. Students of marketing who read the 
book and then decide to specialize in statistics will have to unlearn many of 
the techniques which are incorrectly described here. 

The material covered in the text is presented in a logical and systematic 
order within the framework of five parts. The first part, consisting of three
chapters, is introductory in nature. It discusses the need of research to the 
efficient operation and planning of a business concern and points out that 
sources of information are available from both internal and external sources. 
Internal analysis, it is explained, makes use of current and past operating 
records of the concern. Such records are usually in the form of balance sheets 
and profit and loss statements. In external analysis use is made of surveys, 
statistical series and other sources outside of the concern. 

The second part deals with the use of internal analysis in the solution of 
marketing problems. Although the study of the operations, problems, analy- 
sis, and interpretation of facts pertaining to business concerns is somewhat of 
a science today, the two chapters in this part, “Analysis of Business Records” 
and “The Use of Standards in Internal Analysis,” use business terms and ex- 
pressions quite loosely. The effect is confusing, whether the looseness is due 
to careless wording or to a lack of detailed knowledge of a very exacting 
subject. 
On page 41, for example, it is said that ratios measure “degree of solvency”
and “percentage of borrowed assets.” “Degree of solvency” is not an
established term in business or the economics of business, and the text does 
not make it clear whether it means that a particular business enterprise is 
50 per cent solvent, 70 per cent solvent, 90 per cent solvent, or exactly 
what. The text goes on to state that “degree of solvency” can be measured 
by the ratio of “assets versus liabilities.” Here we become still more con- 
fused, as we are not told what assets or what liabilities are to be included. 
We learn a little later that what is meant is current assets and current liabilities,
although on page 55, where the ratios are listed in fraction form,
the fraction again is based on the very indefinite terms of “assets” and “li- 
abilities.” The ratio of current assets to current liabilities is no measure of 
such indefinite ideas as “degree of solvency” or “percentage of borrowed 
assets.” It is really a measure of the ability of a concern to meet its immediate 
liabilities on scheduled time, and even as such a measure, interpretation must 
be made with extreme care and from fairly wide knowledge or experience. 
The second expression, “percentage of borrowed assets,” has no meaning. No 
business concern borrows assets. Assets are purchased, and until the pur- 
chase price is paid, the liability for the purchase is on the books. The pur- 
chase might be raw material to be paid in 10 days, or real estate and building 
to be paid gradually over a period of 10 years. 

In several of the ratios which are mentioned, “net worth” is used as a 
component. Net worth is the dollar difference between all assets and all lia- 
bilities. Occasionally a concern has total assets of $1,200,000 of which 
$1,000,000 is an intangible item such as goodwill, trade marks, or formulae. 
If the $1,000,000 is deducted, we arrive at a term of “tangible net worth” 
which should be used in these compilations. Otherwise homogeneity, which is 
described as an absolute essential on page 64, is completely lacking. This 
is a matter of very considerable importance and no explanation of the dif- 
ference between net worth and tangible net worth appears in the text. As a 
matter of fact, the term “net worth” is used exclusively in the text. 

The material presented in Part 3 provides a very good summary of the 
use of sample surveys in marketing research. In many respects this part is 
the best in the book, especially from the viewpoints of clarity and accuracy. 
But, even here, we find some errors. For example, sampling is first introduced 
in the chapter on “Planning the Survey Operation.” At the beginning of this 
chapter (p. 83) we find the statement that, in sample surveys, “The impor- 
tant thing here is to make absolutely certain that those selected to represent 
the total population have all the social, economic, and other characteristics 
which typify the original group.” A few lines later we find the statement, 
“Statistically, this is usually secured by what is called a random selection.” 
Of course, in random selection we are never absolutely certain that the 
sample has all of the characteristics of the population. In random selection, 
we rely upon the statistical law of large numbers to approximate this objec- 
tive. 
The issue of area versus quota sampling is discussed in the same part. It
was gratifying to find that through an objective analysis of the two methods 
the authors succeed in conveying to the readers the fallacies of quota sam- 
pling. However, it is unfortunate that the authors persist in using the term 
“stratified sampling” synonymously with “quota sampling.” 

In the remaining chapters of this part we find excellent treatments of the 
problems of questionnaire development, and of the collection and tabulation 
of the data. We are given the benefit of the authors’ wide experience in this 
field. 

“Special Applications of Research Technique” is the title of Part 4, which 
includes a description and evaluation of the methods of establishing sales 
potentials and quotas, and a discussion of some of the problems of radio re- 
search. It is interesting to note that, in commenting upon the use of cor- 
relation analysis in determining market indexes, the authors state (p. 229), 
“This method has the disadvantage of requiring a trained statistician.” 

The part of the book dealing with “Technical Statistical Procedures” is 
the part containing most of the errors. Some of the errors are due to careless- 
ness and others are due to the authors’ eagerness to describe techniques 
which they evidently have never used. The first chapter in this section de- 
scribes fundamental statistical techniques, material which is found in all 
elementary texts on statistics, and yet there are a number of errors. For 
example, in describing the methods of computing the arithmetic mean and 
median, a series of 33 numbers is grouped in the form of a frequency table. 
In computing the mean the midpoint of the class interval, say 13 to 15, is 
taken at 14. However, in calculating the median we find the statement, “The 
actual range of the scores represented by the 13-15 class interval includes 
those that are barely above 12 and those extending up as high as 15 but no 
higher.” The class interval has shifted one half of a unit. However, the cor- 
rect answer is arrived at in this particular instance through the commission of 
another, but compensating, error. The position of the median item is com- 
puted by taking one half of the total frequency plus one instead of one half 
of the total frequency. 
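
The consistent treatment of class limits may be sketched as follows. The frequency table is hypothetical (the book's 33 scores are not reproduced here) and the function names are illustrative. For integer scores the class 13-15 has midpoint 14 and real limits 12.5 and 15.5, and the median of grouped data is located at one half of the total frequency.

```python
def class_midpoint(low, high):
    """Midpoint of a class with stated integer limits, e.g. 13-15 gives 14."""
    return (low + high) / 2.0

def grouped_median(classes):
    """Median of grouped integer scores.

    classes: list of (low, high, frequency) in ascending order,
    with stated integer class limits such as (13, 15, 12).
    """
    n = sum(f for _, _, f in classes)
    target = n / 2.0                      # n/2, not (n + 1)/2, for grouped data
    cumulative = 0
    for low, high, f in classes:
        if f > 0 and cumulative + f >= target:
            lower_boundary = low - 0.5    # real lower limit for integer scores
            width = (high + 0.5) - lower_boundary
            return lower_boundary + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("empty frequency table")

# Hypothetical table of 33 scores.
table = [(10, 12, 5), (13, 15, 12), (16, 18, 10), (19, 21, 6)]
print(class_midpoint(13, 15), grouped_median(table))
```

Using the same real limits for both the mean and the median avoids the half-unit shift noted above, so that no compensating error is needed to reach the right answer.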

There is confusion in the distinction between the standard deviation and 
the standard error. It is stated on page 273 that “The standard error, nu- 
merically, means just about the same thing as the standard deviation.” If 
50 per cent of the public has done a certain thing “there is no way to compute 
a standard deviation of such a figure.” The only thing one can do, say the
authors, is to get the standard error which is given by the familiar √(pq/n)
formula. The distinction between statistical description and statistical infer- 
ence is not recognized. 
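
As a sketch of the distinction, assume a yes/no item coded 1 or 0, a proportion p answering "yes" with q = 1 - p, and a simple random sample of size n. The symbols σ and SE and the numerical example are illustrative and do not come from the book.

```latex
% Description: spread of the individual 0/1 responses
\sigma = \sqrt{pq}
% Inference: sampling variability of the observed proportion
\mathrm{SE}(\hat{p}) = \sqrt{\frac{pq}{n}} = \frac{\sigma}{\sqrt{n}}
% Example with p = 0.5 and n = 100:
% \sigma = 0.5, \quad \mathrm{SE}(\hat{p}) = 0.05
```

On this reading a standard deviation can be computed for a 50 per cent figure; it describes the responses themselves, whereas the standard error describes how far the sample percentage may stray from the population value.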

The treatment of “Sample Size and Accuracy” is somewhat puzzling. It is 
emphasized at the outset that statistical formulas for sampling errors can be 
applied only when the sample has been drawn on a random basis. We are 
warned that the formulas do not apply in quota sampling and that modifica- 
tions of the random sampling formulas have to be made when stratification 
or area sampling has been employed. However, the authors state that once
data have been collected, regardless of the particular sampling used, the
sampling error can be computed by established techniques! They have 
evidently fallen into this fallacy because of their ignorance of the difference 
between the standard deviation and the standard error. 

In a table on page 280 illustrating the cut-off method for determining the 
size of an adequate sample we find that among 50 interviews 49.2 per cent 
answered “yes” to a particular question and that among 100 interviews 50.1 
per cent answered “yes.” Since such results are arithmetically impossible 
when each interview represents a single individual, we are led to believe 
that an interview may represent several individuals. However, there is no 
indication in the text that this is the case. 
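
The arithmetic objection can be checked mechanically. The sketch below assumes that percentages are reported to one decimal place and that each interview counts as exactly one respondent; the function name is illustrative.

```python
def percentage_is_achievable(pct, n, decimals=1):
    """True if some whole number of 'yes' answers out of n rounds to pct."""
    target = round(pct, decimals)
    return any(round(100.0 * k / n, decimals) == target for k in range(n + 1))


# The two figures quoted from the table on page 280.
possible_50 = percentage_is_achievable(49.2, 50)
possible_100 = percentage_is_achievable(50.1, 100)
```

With 50 interviews the only attainable percentages are multiples of 2, and with 100 interviews multiples of 1, so neither quoted figure can arise from whole respondents.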

On page 290 the following statement appears: “The midpoint between the 
perfect positive and perfect negative correlation is indicated statistically by 
a correlation of .00. This means that there is no relationship whatever be- 
tween the variables being analyzed.” What should have been said is that 
there is no simple linear relationship between the variables. 
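
A small constructed example makes the point concrete: two variables may be tied together exactly and still show a correlation of .00. The data below are invented for illustration, and the helper function is a plain implementation of the product-moment coefficient.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# y is completely determined by x, yet the relationship is not linear.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]
r = pearson_r(xs, ys)
```

Here r is zero even though y can be predicted from x without error, so a coefficient of .00 rules out only a linear relationship.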

It is stated on page 295 that, “Mathematically, it is seldom sound to... 
[compute a linear correlation coefficient] with fewer than 100 cases.” Pro- 
vided the correct test of significance is used, it is sound to compute r with as 
few as three cases. 
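
The review does not name the test it has in mind. Presumably it is the usual t test of the hypothesis of zero correlation, which is sketched below on the assumption of a random sample of n pairs from a bivariate normal population.

```latex
t = \frac{r\,\sqrt{n-2}}{\sqrt{1-r^{2}}}, \qquad \text{with } n-2 \text{ degrees of freedom}
% For n = 3 the statistic has a single degree of freedom, so only values of r
% very close to +1 or -1 reach significance, but the test remains valid.
```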

On page 305 in discussing the uses of the chi-square test it is suggested
that, “It might be used to determine the significance of sales estimates ... 
incorporated in the sales budget) against sales returns.” The comparison 
between an actual series and a theoretical series through the use of chi- 
square can be made only when both series consist of frequencies. 
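
The restriction follows from the form of the statistic. In the sketch below, O_i and E_i are observed and expected frequencies (counts of cases) in the i-th class, and c is an arbitrary change of unit introduced for illustration.

```latex
\chi^{2} = \sum_{i} \frac{(O_i - E_i)^{2}}{E_i}
% If the entries are magnitudes rather than counts, a change of unit
% (O_i \to cO_i,\; E_i \to cE_i) multiplies the statistic by c:
\sum_{i} \frac{(cO_i - cE_i)^{2}}{cE_i} = c\,\chi^{2}
```

Sales expressed in dollars and in cents would thus yield different values, so the tabled chi-square distribution has no meaning for such data.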

It is unfortunate that this book did not receive a thorough review prior 
to its publication. The layout, presentation, and style are above par for a 
text of this type. However, because of the great number of errors present, 
it is not very well suited for classroom or general use. It is hoped that a re- 
vised edition of the book will soon appear. 


REVIEW BY PHILIP J. MCCARTHY
Assistant Professor of Sociology, Cornell University 
Ithaca, New York 


The topics considered in this book fall into three categories: the use of
internal company records for market research, the application of survey 
techniques for obtaining data external to the company, and the statistical 
concepts which are frequently required in considering either or both of the 
other two topics. All three are well illustrated with practical examples. The 
Preface states that the book has been prepared with intentional simplicity 
for students in marketing research and for businessmen who buy or other- 
wise make use of commercial research. This, in itself, is a desirable character- 
istic of a book for beginners. However, when considered in conjunction with 
the fact that much of the material on survey techniques has already been
presented in a simple and more exhaustive manner by one of the authors,1
and with the fact that statistical techniques are very difficult to treat in a 
simple manner, this simplicity becomes the greatest shortcoming of the book. 

The discussion of internal company records as a tool for research is an 
innovation for a marketing research book. As such, it should prove particu- 
larly interesting and valuable for those individuals who have entered the 
field of commercial research through academic disciplines other than eco- 
nomics or business administration. However, it is doubtful whether the stu- 
dent of marketing or the businessman will find this approach new. Most 
students of marketing will already have covered the field of internal record 
analysis in more detail in their courses in business statistics and accounting, 
and it seems likely that the businessman would have long since been forced 
into a careful study of the internal records of his company. 

Survey techniques are skimmed over lightly, the presentation being very 
similar to and less detailed than that of the previously noted book by Blan- 
kenship. The authors do maintain a critical attitude throughout this section, 
and their comments on methods and techniques are unusually well chosen 
and to the point. They have perhaps leaned over backwards to stress the 
use of random methods in the design of sampling procedures, but considering 
the widespread use of other methods in marketing research, this can certainly 
do nothing but good. They do attempt to wrap all sampling methods up into 
four neat packages (an impossible task), and they do say nothing about 
how the businessman or the commercial research organization should deter- 
mine the limits of error which can be tolerated in any given survey operation. 
Unfortunately there is very little information on this topic, and some con- 
sidered thought needs to be devoted to it. 

The final topic covered in the book, namely statistical techniques, is an- 
other attempt to convey statistical understanding to persons without any 
mathematical background. The present treatment is not distinguished in any 
way, and when one considers that measures of central tendency, measures 
of dispersion, sample size and accuracy, correlation, analysis of variance, 
analysis of covariance, factor analysis, chi-square and sequential analysis 
are all touched upon in the short space of fifty pages, there is serious reason 
to doubt whether this section will serve any useful purpose. The interested 
reader might better be referred to some such book as How to Read Statistics 
by R. L. C. Butsch (see reviews in this JOURNAL, December 1946).


O Custo de Produção do Homem Adulto e Sua Variação em Relação à Mortalidade.
(The Cost of Production of the Human Adult and Its Variation in Relation
to Mortality.) Giorgio Mortara (Technical Advisor, National Census Commission,
Brazilian Institute of Geography and Statistics, Rio de Janeiro, Brazil).
Estudos Brasileiros de Demografia, Monografia No. 2. Rio de Janeiro, Brazil:
Livraria Kosmos Editôra, 1946. Pp. 152. 40 cruzeiros. Paper.


1 Blankenship, Albert B., Consumer and Opinion Research. New York: Harper & Brothers, 1943.
Pp. xi, 238.

REVIEW BY T. N. E. GREVILLE
Principal Actuarial Mathematician, National Office of Vital Statistics 
U. S. Public Health Service, Washington, D. C. 


This is the second of a series of monographs entitled “Brazilian Demographic
Studies,” being published under the auspices of the Getúlio
Vargas Foundation. Pareto in 1893 (Giornale degli Economisti, 1893, 2nd 
half-year, pp. 451-456) discussed the cost of production of a human adult, 
defined as the total cost of upbringing of an entire generation from birth 
to age 15 or 20 (including the children who fail to survive) divided by the 
number of survivors at the selected age. Finding this cost to be only slightly 
less in countries with low infant mortality than in those having high infant 
death rates, Pareto erroneously concluded that a low infant mortality tends 
to be offset by a high death rate in later childhood and adolescence. The 
author of the present monograph conclusively disproves this contention, 
marshaling a mass of data derived from life tables for a number of countries 
at various periods to show that a reduction in infant mortality has invari- 
ably been associated with a corresponding reduction at subsequent ages. 

Postulating a scale of relative living expenses by age increasing in geo- 
metric progression from 26.79 in the first year of life to 100.0 for the twen- 
tieth and each subsequent year, he obtains values, in terms of the conven- 
tional units so defined, for the total cost of production (up to age 15) of a 
human adult in various countries and epochs, including the proportionate 
share of the cost occasioned by those who died prior to age 15. Subdividing 
the result into two parts, corresponding to the proportionate share of the 
cost of upbringing of the survivors and nonsurvivors separately, he shows 
that even in those countries having the highest infant mortality, the latter 
portion of the cost constitutes only slightly more than ten per cent of the 
total. This figure represents, therefore, the maximum reduction in the total 
cost that could, in theory, be effected in a given country by reducing the 
death rate in infancy and childhood. However, if one considers separately 
the percentage change in the portion of the cost which relates to the nonsur- 
vivors, there has been a spectacular reduction. For example, it was 64 per 
cent for white males in the United States between 1900-1902 and 1939. 
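
The calculation described above can be sketched as follows. The scale endpoints (26.79 and 100.0) and the cutoff at age 15 come from the review; the survivorship column is hypothetical, and the convention that a child dying within a year of age incurs half of that year's cost is an assumption of this sketch, not necessarily the author's.

```python
def living_cost_scale(first_year=26.79, adult=100.0, adult_year=20):
    """Relative living cost for years of life 1..adult_year, in geometric progression."""
    ratio = (adult / first_year) ** (1.0 / (adult_year - 1))
    return [first_year * ratio ** k for k in range(adult_year)]


def cost_per_surviving_adult(alive, scale, age=15):
    """alive[x] is the number living at exact age x, for x = 0..age.

    Returns (cost per survivor at `age`, share of total cost due to nonsurvivors).
    Assumption: a death between ages x and x + 1 incurs half of year x's cost.
    """
    survivors_cost = alive[age] * sum(scale[:age])
    nonsurvivors_cost = 0.0
    spent_so_far = 0.0  # cost already incurred by a child reaching exact age x
    for x in range(age):
        deaths = alive[x] - alive[x + 1]
        nonsurvivors_cost += deaths * (spent_so_far + 0.5 * scale[x])
        spent_so_far += scale[x]
    total = survivors_cost + nonsurvivors_cost
    return total / alive[age], nonsurvivors_cost / total


scale = living_cost_scale()
# Hypothetical survivorship column out of 1,000 births (not from the monograph).
alive = [1000, 850, 820, 805, 795, 788, 783, 779,
         776, 773, 770, 768, 766, 764, 762, 760]
cost, share = cost_per_surviving_adult(alive, scale)
```

The annual ratio implied by the two endpoints is about 1.072, so the scale roughly doubles every ten years. Because most childhood deaths fall in the earliest and cheapest years, the share of cost due to nonsurvivors stays small even when mortality is heavy.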

On the basis of available data, and making certain assumptions, the author 
attempts to illustrate his results in terms of concrete monetary values, estimating
the average cost of production of a human adult under the economic
conditions existing in Brazil in 1939 to have been between 8,700 and 8,750 
cruzeiros. At the rate of exchange then prevailing this would be equivalent to 
approximately $450 in United States money. 

The style is lucid, and the argument convincing. In general, the conclu- 
sions stated appear to be amply supported; in fact, the documentation, in 
some instances, is perhaps unnecessarily full, as some of the main conclusions 
are today hardly open to dispute. The author promises that a subsequent 
monograph of the series will deal with the average productivity of an adult
during his lifetime and its variation in relation to mortality. 


Forecasting Sales. G. Clark Thompson. Conference Board Reports, Studies in 
Business Policy, No. 25. New York 17: National Industrial Conference Board, 
Inc. (247 Park Ave.), 1947. Pp. 47. Paper. Price on application. Two reviews 
follow: 





REVIEW BY CHARLES F. ROOS
President, The Econometric Institute, Inc. 
500 Fifth Ave., New York 18, N. Y. 


No longer than ten years ago sales forecasting by most companies was
confined to tabulating the guesses or hopes of the sales department; as 
late as 1940 only a few companies recognized that their sales are closely 
geared to national or regional purchasing power and other economic variables 
that can be satisfactorily forecast. As the Foreword of Forecasting Sales 
states, 
Until recent times, most companies have contented themselves with crude
appraisals of the general business outlook. The gradual development of
economic and marketing research techniques, however, have provided
businessmen with an opportunity to make more accurate sales forecasts
than had previously been possible. Today, numerous companies regularly
forecast their sales within 10% of actual performance and often come
within 1%.
A realization of the value of such forecasts, together with increased confidence
in present-day forecasting techniques, has encouraged many companies
to attempt to improve their methods or to develop new ones.


The main purpose of this report is to illustrate forecasting techniques cur- 
rently being used by companies in such diverse fields as food processing, 
metal fabricating and manufacturing, air conditioning equipment, furniture, 
electrical equipment, chemicals, rubber products and building materials. 
The methods illustrated for forecasting sales include the polling of executive 
and sales opinions, correlation analysis, trends and cycles and the use of 
mathematical formulas which last in reality involve the use of statistical 
complexes and accounting techniques. Actual examples are used to develop 
the techniques and, consequently, they are characterized by uneven degrees 
of statistical refinement and imagination. 

The first example presented under the section devoted to statistical meth- 
ods is that of a metals company. Joint study by outside consultants and a 
company specialist revealed a definite relation between the sales of the com- 
pany’s products and various special groupings of industries included in the 
Federal Reserve Board index of industrial production. The example pre- 
sented correlates the unit sales of Product “A” with the Federal Reserve 
Board index of industrial production and the ratio of automobile production 
to this index. Thus, the approach avoids the pitfall of using correlated varia- 
bles in a multiple correlation. The “fit” between actual and estimated sales 
is quite good, as it should be, because the product is sold to many industries
but in volume to the automobile industry. This means that if the company 
can secure reliable forecasts of the FRB index of industrial production and 
of automobile production it can make reliable forecasts of the sale of Product 
“A.” The company obtains these general forecasts from a private research 
organization which specializes in this work. 

Another example compares the plumbing sales of a manufacturer with 
construction contracts. The agreement, of course, is generally good. How- 
ever, there are important deviations which seem too large to be ignored. The 
reviewer has found that these plumbing sales can be explained more satis- 
factorily in terms of Supernumerary Income (disposable income less cost of 
subsistence living) and construction contracts. The Supernumerary Income 
is the purchasing power available for purchases of plumbing supplies for 
modernization. This income is sometimes called Discretionary Income. 

The projection of trends is the method used by an electric power company. 
It projects the trends of (1) population, (2) meters or customers, (3) electric 
power consumption per customer, and (4) revenues. The trend of each is stud- 
ied separately and also in relation to the other three. As crosschecks or links 
the company also studies (a) the meter ratio (number of people per meter), 
(b) sales per meter, (c) revenue per meter, (d) revenue per unit, and (e) reve- 
nue per unit vs. consumption per meter. With the aid of mathematical for- 
mulas and a certain amount of judgment the trends of the link factors are 
projected. 

Industry replacement sales of automobile tires are forecast by a mathe- 
matical formula which includes the number of automobiles over two years 
of age, consumption of gasoline per car of these cars, a factor measuring tire 
quality, and sales of camelback, the unvulcanized rubber used for recapping. 
A curve can be readily constructed from life tables to show the number of 
cars over two years old, the tire quality is known, and sales of camelback 
are not large enough to introduce much error. Consequently, the principal 
problem is that of forecasting gasoline consumption. For this purpose the 
company projects the trend of gasoline consumption per car by mathematical 
extrapolation modified to allow for special conditions which will, in the fore- 
caster’s judgment, affect the demand in future years. Much more accurate 
forecasts of gasoline consumption per car can, however, be made by relating 
this variable to disposable income and gasoline price and forecasting disposa- 
ble income. Gasoline price is related to this income, taxes, and refinery costs. 

Some companies use several methods for forecasting sales and then recon- 
cile the forecasts in meetings of executive or planning committees. In such 
instances, the forecast has broad authority and acceptance throughout the 
organization. While it is recognized that the forecast may still be wrong, it 
is accepted as the best estimate the organization can make based on eco- 
nomic and statistical information as checked by the business experience of 
the company’s top officers. 

It is now possible for companies to take the next step and plan sales budgets
and allocations by counties. The Office Equipment Manufacturers’ Institute
has arranged with the Social Security Board to tabulate for each county
in the United States employment, payrolls and the number of establishments
broken down to size groups. These will be provided for the main classifications 
of mining, manufacturing, wholesale trade, law offices, medical and health 
services, and nonprofit organizations. It will now be possible for companies 
to make studies of how their sales gear to these county data and then by the 
use of tabulating machines to set realistic sales budgets by counties and sales 
territories. 

This trend toward better forecasting, in the reviewer’s opinion, will prob- 
ably lessen the intensity of business cycles and will certainly lead to the 
dampening or elimination of seasonals. This is especially important at a 
time when union leaders are beginning to press for guaranteed annual wages. 


REVIEW BY H. D. WOLFE
Director, Market Research Department 
Colgate-Palmolive-Peet Co., Jersey City, N. J. 


This small pamphlet on methods and practices of forecasting sales is one
of the most helpful and illuminating contributions ever made on this
most vital subject. Much has been written about distribution costs and the 
fact that they may or may not be too high. Thompson, through exposition 
of method buttressed by actual cases of the practices employed by fifteen 
large industrial companies, demonstrates how accurate sales forecasts can 
lower distribution costs. Costs of production, selling prices, and profits are, 
of course, dependent on sales volume. The sales forecast of volume (either 
dollar value or physical volume) affects the proper purchasing of production 
and packaging supplies, the finances needed, transportation facilities for 
which provision must be made, and influences the selling and advertising 
effort that must be expended. An accurate sales forecast can, therefore, result 
in the most efficient utilization (hence the lowest cost) of all the factors in- 
volved. 

There is a minimum of reference to actual statistical methods employed 
to achieve accurate sales forecasts. The reader is referred, in the appendix, 
to some of the standard works on statistical methods and business-cycle 
theories. The value of this pamphlet lies in its collation and description of 
the methods in actual use in 107 companies in the United States and Canada. 
In addition, rather complete case histories of particular methods utilized by 
15 representative companies are presented throughout the pamphlet; empha- 
sis is placed on the importance and the necessity of critically examining 
forecasts based on statistical methods. Judgment, experience, and the test 
of reasonableness must be applied to every forecast before the forecast be- 
comes an official figure of the company. As the author, in analyzing correla- 
tion methods so well puts it, “the more reliable the correlation, the more 
dangerous its use becomes, since there is a tendency to place blind faith in a 
method that seems infallible. Even the best correlations are subject to
chance variations, and one abrupt or severe deviation from normal may be all 
that is necessary to bring severe losses to a company.” 

Every company, no matter whether small or large, and whether or not 
they make official or formal sales forecasts, would be well advised to have 
this pamphlet read by every employee directly concerned with sales, ad- 
vertising, production, or financial budgets. No matter how advanced may be 
the thinking within a particular company with respect to sales forecasting, 
there is much material in the pamphlet which could probably properly be 
used in conjunction with the methods now employed by any such company. 

Specifically, the pamphlet describes the following methods of sales fore- 
casting, which may include national and territorial sales forecasts by product 
lines. 

NONSTATISTICAL METHODS. (a) The Jury of Executive Opinion. Under this
method, top executives combine their experience and knowledge to arrive 
at an acceptable figure for budgeting sales. (b) The Sales Force Composite 
Method. Estimates are gathered by divisional managers from each of their
salesmen and sent to the home office for examination and reconciliation by 
the top sales-management group in the home office. 

STATISTICAL METHODS (Correlation Analysis, Trends and Cycles, and
Mathematical Formulas). When statistical methods are employed, the author
specifically recommends that such methods be kept as simple as possible in
order to assure their acceptance by the operating men who are to use the 
forecasts in the conduct of the company’s business. If the factors affecting 
the sales of a particular company are complex and thus require complex 
statistical methods, the author recommends that the forecaster “devise 
means of instilling the necessary understanding and confidence in the ulti- 
mate users.” Constant review and revision of the forecasts must be made 
after the period of forecast has begun in order to check the actual perform- 
ance with the forecasted estimates and to adjust for any change in conditions 
that may have occurred after the forecasts were made. 

MULTIPLE METHOD APPROACH. Under this approach, forecasts made by
nonstatistical methods are compared with the forecasts arrived at from 
statistical analysis. The multiple method is highly recommended since it 
combines all the experience and knowledge within the company. As the 
analysis states, “the critical examination of differences between two or more 
independent forecasts sharpens executive judgment and tends to make the 
final forecast more reliable.” 

The only criticism of the pamphlet is the lack of recognition of the use of 
growth curves for short-term forecasts. Perhaps this method has not yet 
found wide usage, although its value has long been recognized for long-term 
forecasts. 

It is hoped that the author will find time to expand the material in this 
pamphlet as more information becomes available, and as the pattern of the 
postwar economy emerges more clearly. It may well be that prewar statistical
correlations, based on a 100 billion dollar maximum economy, may not
continue within a 200 billion dollar economy.


Forecasting for Profit: A Technique for Business Management. Wilson Wright 
(Economist, Armstrong Cork Company, Lancaster, Pa.) New York 16: John 
Wiley & Sons, Inc. (440 Fourth Ave.), 1947. Pp. vii, 173. $2.75. (London, W.C.2, 
England: Chapman & Hall, Ltd. [37-39 Essex St., Strand], 1948. 16s. 6d.) Two
reviews follow: 


REVIEW BY WILLIAM A. SPURR 
Professor of Business Statistics, Stanford University, Stanford, California 


This is an essay touching on many topics in business forecasting. The aim
is to show the business executive, the business economist and the student
how to predict a company’s sales, profits, and prices. Toward this end Mr. 
Wright discusses the need for forecasting, types of forecasts, the factors 
which determine sales and profits, a method of making a general business 
forecast, the uses of national income data, the nature of money, the statis- 
tical series used by the Armstrong Cork Company, the effect of competition 
on a company’s outlook, business cycles, “economic models,” mechanical 
forecasting devices, types of price forecasting, the determination of turning 
points in sales and profits, criteria for classifying products, and the final as- 
sembly of the sales forecast. 

Mr. Wright thus points out the principal problems faced by the forecaster. 
His discussion appears plausible and persuasive. His viewpoint is that of a 
practicing business economist. However, I fail to see how one could learn 
how to make a business forecast from reading this book. There are two fun- 
damental criticisms. 

1) The treatment is too superficial. Each problem is merely mentioned in 
a general way, rather than being carefully analyzed and explained. Further- 
more, the topics are heterogeneous and unorganized, and terminology is 
vague. Definitions are fuzzy, and sometimes inconsistent with ordinary us- 
age. One gets the impression that the book was dictated in a very short 
time. As further clues to the author’s thoroughness, the footnotes, bibliogra- 
phy, and index are rudimentary; the few charts show only hypothetical 
data, and the very few tables of actual data are brought up only to 1944. 

2) There are six essential elements in any good forecasting book, none of
which is developed adequately here: (a) an explanation of pertinent sta- 
tistical methods, (b) business cycle description and theory, (c) a critical ex- 
amination of leading business indexes, (d) analysis of principal noncyclical 
factors affecting business, (e) historical survey of recent economic fluctua- 
tions, (f) a detailed case study of a specific forecast, showing how the above 
principles are applied in practice. A business economist should do this last 
job well, but despite Mr. Wright’s many references to his work at the Arm- 
strong Cork Company, he still fails to get across to this reader just how they 
make a forecast. (The treatment in pages 133-36 is just a start.) 

As one illustration of both these criticisms, the statement is made on page
83, that, “ . . . it is helpful to break down or decompose by statistical manipulation
an index of business activity into component elements.” Very well,
then what methods of time series analysis are most appropriate and how may 
they best be applied in practice? Mr. Wright’s only explanation is, “Statistical
decomposition involves calculating the various equilibrium values
in a series and the mathematical difference of each.”

The reader, however, should look at both sides of the matter by contrast- 
ing this review with one which appeared in the American Economic Review 
for December 1947 (pp. 974-77). The reviewer there outlines Mr. Wright’s 
thesis at some length and calls it “competent and thorough... well or- 
ganized, logical.” 

To resume: if this treatise is lacking, then what is a good book on business 
forecasting? The prewar books are now rather dated, unfortunately, because 
of the rapid rate of obsolescence in the data, methods and strategic factors 
used in forecasting. Three books have been published since the war: (1) 
Dewey and Dakin’s periodicity study, Cycles: The Science of Prediction 
(Henry Holt and Co., Inc., 1947), which is definitely misleading, in my opinion.
(2) Wright’s Forecasting for Profit, which is innocuous but thin. (3)
Bratt’s Business Cycles and Forecasting, Third Edition (Richard D. Irwin, 
Inc., 1948) which I would recommend as the only adequate treatise of the 
three. 


REVIEW BY VICTOR VON SZELISKI 
Vice President, U.S. Economics Corporation 
10 Rockefeller Plaza, New York 20, N. Y. 


This book is recommended to business managers and to professional and
academic economists. It is not so much a description of forecasting de- 
vices as an inventory and broad description of those parts of the social sci- 
ences which are useful to practicing business economists. A more descriptive 
title would be “Role of Forecasting in Business Management and the Kit of 
Social Science Tools Needed by the Company Forecaster.” 

The author correctly regards forecasting as a practical art and craft, in 
this respect disassociating himself from the school which regards it as a de- 
partment of mathematics. Forecasting is trying “to understand and describe 
the specific results of economic and political action taken by human be- 
ings. ... The tools of the useful practicing economists are statistics of in- 
determinable accuracy, a knowledge of how people have acted in the past 
in situations which have been similar in some respects, a disciplined mind, 
knowledge derived as a result of critical observation, imagination, and com- 
mon sense.” So it is not surprising to find Mr. Wright recommending in 
the bibliography books by Pareto and Sorokin (but the reviewer does not in- 
tend to follow Mr. Wright’s bibliography to the extent of reading the book by 
Major Angas).

The opening pages on the role of forecasting in business today and the position
of the forecaster-economist vis-à-vis management are a necessary introduction
to any book on practical forecasting.

In a brief paragraph on page 159, the author describes how forecasting has 
been integrated into the operations of the Armstrong Cork Company through
a forecasting committee, how forecasting has been assimilated into the regular
operating routine. Perhaps Mr. Wright will expand this tantalizing
paragraph at some future time and explain the mechanics of how forecasts
are translated into action in purchasing, sales, plant construction, and fiscal 
management. 

The key forecast, in conformity with which an individual company’s 
decisions have to be worked out, is the sales forecast. The author takes 
up in Chapter 3 the prerequisites necessary for developing a sales fore- 
cast, emphasizing the distinction between internal or controllable factors, 
and external or uncontrollable factors. Analysis of the external factors con- 
trolling company sales usually shows that the general business cycle is the 
dominant factor in company sales. 

The first step in business cycle forecasting is making an “estimate of the 
situation,” which is an evaluation of the strains or unbalances currently 
existing and the direction in which the economy would normally move as a 
result of these strains. The reviewer’s description of the book as a list of 
needed tools rather than a manual on how to use them is illustrated by the 
fact that the difficult question of equilibrium and tension is described in less 
than 4 pages. On the basis of the short description given, the author’s 
method of measuring equilibrium (use of trend lines) appears too mechani- 
cal. Equilibrium levels can certainly shift in ways which are not adequately 
measured by trend lines, and which can be explained statistically only by 
taking account of a number of factors. 

The author’s use of expectations is interesting—the more so as reports 
on the expectations of business executives now appear in Fortune. 

More hints on diagnosing economic tension are given in Chapter 13. 
Maladjustments and disorganization within the supply of commodities are 
looked for; the magnitude of consumers’ stocks of durable commodities
is to be appraised; the probable influence of prolonged trends is to be
evaluated. An important recession in economic activity can be looked for
when it is evident that excess productive capacity has been developed in 
a number of important industries. Residential construction is especially 
important because of the long, useful life of structures. “A large part of the 
success experienced in the practice of business forecasting may be attributed 
to competent analysis of the situation in, and outlook for, the construction 
industry, with particular emphasis on privately financed residential con- 
struction.” 

It is then up to the forecaster to figure out when and how fast and how far 
this tension will be released. 

As a prerequisite for making useful, short-term forecasts, it is necessary
to have a long-term forecast because the latter will condition the interpre- 
tation of the short-term occurrences and the prediction of the duration of 
short-term movements. The interpretation placed upon the expectation of 
business managers will also depend on the estimation of the long-term dis- 
equilibria which have to be corrected. 

Increases in labor costs and in material costs per unit of output may be the
signal for the release of a tension by a decline in business activity. The au- 
thor also thinks that stock prices can be profitably analyzed for “measuring 
the effect of events on the expectations of persons who are highly interested 
in the trend of business activity and the magnitude of profits.” Maybe so. 
Mr. Wright regards the message of the stock market as especially important 
when the analysis of tensions indicates that a reversal of the trend of business 
activity is in the cards. 

The author devotes a few pages to economic models. “Being in the vogue, 
these instruments sometimes have been used to dignify illogical assumption 
with a mantle of precise arithmetic. Because of this practice, some experi- 
enced economists are inclined to lift their eyebrows at the use of these mathe- 
matical devices. They can be useful, however, for the purposes of persons 
who are interested in trying to forecast the development of actual situations 
in the economy.” The reviewer believes that the construction of economic 
models is absolutely essential for assuring one’s self that he has included (or 
for forcing one’s self to include) all major relevant considerations. A model 
should be thought of as a mosaic picture or a jig-saw puzzle—a matrix into 
which an economist can fit the thousand-odd clues collected for his work; 
the individual pieces do not mean very much by themselves but when fitted 
together they make a picture that is clearer than the mere sum of the parts. 
In trying to make them fit, the economist learns, and improves his estimate 
of the situation. At present, however, craftsmanship rather than science must 
be relied upon in this necessary task because the mathematical methods for 
model construction still limp badly.

The author also has chapters on industry competition and price forecast- 
ing. Raw material prices and selling prices are of immediate interest to in- 
dustrial concerns, and so also are equity prices, as indications of how easily 
business can finance new capital expenditures with equity capital—matters 
which are of extreme importance in those stages of the business cycle where 
the forecast resolves itself into a question of how much business will spend for
capital construction. 

The author scarcely discusses at all the opposite problem of how tension 
and disequilibria are created—how and why the economy moves from a posi- 
tion of equilibrium to one of tension. From the viewpoint of the science of 
economics, many if not most of the “why’s” are accidents; they are political 
or social in origin. 

Chapters 5 through 7 describe measures of national output, national income,
the makeup of the money supply, and some of the mechanisms that
lead to changes in the money supply. Chapter 7 gives in detail the catalog
of statistics on these and related factors that the Armstrong Cork
Company follows as background material from which forecasts are developed.

The author then considers in Chapter 15 how the national business cycle 
forecast is translated into a forecast of industry and company sales—by 
correlation techniques, for example. As a preliminary step, the author makes 
a formal statement of the assumptions used in making the various estimates:
assumptions drawn from analyses of the general economic pictures, as well 
as particular assumptions which have to be made about prices of specific 
commodities with which the company has to be concerned. After pre- 
liminary mathematical estimates of company and industrial activity have 
been set up, it is very important to refine these estimates by considering 
what deviations from the formula values are likely to obtain. “Guessing” 
what the residuals are going to be makes a high call on the craftsmanship of 
the economists. “The most accurate forecasting will be done by those who 
use judgment as well as mechanical aids in establishing estimated deviations. 
Personal judgment is required in making a useful estimate of the situation, 
and judgment again is necessary to establish the expected deviations from 
computed values when the correlations are used in forecasting for practical 
purposes.” 
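The procedure described in the preceding paragraph can be sketched roughly as follows: a least-squares line relates company sales to a national activity index, the line gives a formula value for the forecast period, and a judgmental deviation is then added. All figures and names here (national_index, company_sales, judged_deviation) are hypothetical and are not taken from the book under review; the straight-line form is likewise only an assumption made for the illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical history: a national activity index and company sales.
national_index = [80.0, 90.0, 100.0, 110.0, 120.0]
company_sales = [41.0, 44.0, 51.0, 54.0, 60.0]

a, b = fit_line(national_index, company_sales)

# Past deviations of actual sales from the formula values.
residuals = [y - (a + b * x) for x, y in zip(national_index, company_sales)]

# Formula value for a hypothetical forecast of the national index.
forecast_index = 105.0
formula_value = a + b * forecast_index

# Judgmental estimate of the deviation (hypothetical), for example an
# expected reduction of customers' inventories.
judged_deviation = -1.5
adjusted_forecast = formula_value + judged_deviation

print(f"formula value: {formula_value:.1f}")
print(f"adjusted forecast: {adjusted_forecast:.1f}")
```

The past residuals are kept because they are the natural starting point for judging how large a deviation is plausible; the size and sign of judged_deviation remain a matter of craftsmanship, as the review stresses.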

Deviations from the national pattern may be traced to at least four causes 
which the professional forecaster must evaluate carefully: (a) Secular change
in acceptance of the commodity. (b) Change in customers’ inventories. (c) 
Errors due to inaccuracies in the statistics. (d) Inadequacy of the inde- 
pendent variable as a measure of the market. 

Among the most interesting pages in the book to a practical forecaster are 
the summaries or appraisals of economic situations of forecasts prepared for 
Armstrong Cork Company in 1931, 1932, 1936, 1937, and 1938. 

THEORY OF STATISTICAL
INDUCTION 


Professor Kingston has kindly
called my attention to the fact that
certain of the errors mentioned in my
review (this JOURNAL, March 1947) of
his book (A Teoria da Indução Estatística,
1945) had, in the meantime, been corrected
by the author in Revista Brasileira
da Estatistica (No. 25, p. 56 and
No. 26, p. 326). A rather complete 
corrigenda reached me after the re- 
view was submitted for publication. 

I should also like to take this oppor- 
tunity to clarify the impression that 
my review has made on some readers. 
The first paragraph was intended to 
make it clear that this book was better 
than some which presumably reputable 
publishers still add to the large English 
literature. Though I am not well 
acquainted with the field, it seems very 
likely that this is the best and most 
modern book in any of the Latin 
languages. 


JOHN W. TUKEY, Assistant Professor
of Mathematics, Princeton
University, Princeton, N. J.


EXAMINATION OF INDUSTRIAL 
MEASUREMENTS 


IN THE June 1947 issue of this JOURNAL,
two reviews of Examination of
Industrial Measurements (McGraw- 
Hill Book Co., 1946) were presented by 
Professor C. W. Churchman and by 
Mr. Joseph Manuele and Mr. Roscoe 
Byers. The writer wishes to comment 
briefly on these reviews. 

Professor Churchman questions our
reference to standard “betting odds”
in limits recommended by the ASTM
and the ASA. Here we are referring to
three-sigma limits for indicating assignable
variations in control charts
and fitted curves. Professor Churchman’s
statement that “odds differ so
widely depending on the product” apparently
refers to allowable defectives
or allowable quality variations in
various products. We hope that he
does not infer, for example, that the
cement industry is more willing to
take chances with its customers than
is the textile industry, the steel industry,
or any other group of manufacturers.

Professor Churchman’s comment on
sample sizes from 300 to 600 failed to
mention that these were specified for
fraction defective only. We agree that
this estimate is “arbitrary” along with
other such choices made in industrial
practice. Actually, our recommendation
is based on a statement in L. E.
Simon’s well-known Engineer’s Manual
of Statistical Methods, and appropriate
reference was made in our text.
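To give a rough sense of what sample sizes of 300 and 600 imply for fraction defective, the sketch below computes the binomial standard error and the corresponding three-sigma limits. It is an illustration only and is not drawn from the book or from Simon's manual; the process average p_bar = 0.05 is a hypothetical value, and the usual binomial formula sqrt(p(1-p)/n) is assumed.

```python
import math


def p_chart_limits(p_bar, n):
    """Three-sigma limits for a fraction defective from samples of size n.

    Uses the binomial standard error sqrt(p_bar * (1 - p_bar) / n) and
    clips the limits to the interval [0, 1].
    """
    sigma_p = math.sqrt(p_bar * (1.0 - p_bar) / n)
    lower = max(0.0, p_bar - 3.0 * sigma_p)
    upper = min(1.0, p_bar + 3.0 * sigma_p)
    return lower, upper, sigma_p


p_bar = 0.05  # hypothetical process average fraction defective
for n in (300, 600):
    lower, upper, sigma_p = p_chart_limits(p_bar, n)
    print(f"n = {n}: sigma = {sigma_p:.4f}, limits = ({lower:.4f}, {upper:.4f})")
```

Doubling the sample size narrows the limits by a factor of the square root of two, which is one way to see why any particular choice within such a range is to some degree arbitrary.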

We agree that the difference between
sample estimates and true parameters
should be emphasized, and our book
follows the ASTM terminology in this
respect. However, in industrial work
we never know “true” values except
as assumed standards. Formulae related
to the normal curve (or any other
curve) are of no value to the engineer
unless he applies them to actual data.
In our experience, we have encountered
more dangers due to the presence of
assignably large variations than those
due to sampling error, assuming that
the sample sizes and procedures recommended
for control charts are used.
Professor Churchman’s review well
illustrates the difference in viewpoint
between the statistician and the practicing
engineer. Wartime applications
have shown that engineers and others
of considerably less formal training can
grasp and apply the “logical basis” of
industrial statistical methods without
knowing all of the mathematical
derivations. The definition of “logical
basis” is a matter of degree, depending
upon whether such definition is
that of the mathematician, the statistician,
the engineer, or the plant
foreman.

Mr. Manuele’s review states that
“neither probability or risk is defined
or mentioned” in the discussion of 
sampling. The three-sigma limit used 
throughout the book is certainly based 
on a very definite and strong prob- 
ability. The engineer who has con- 
structed a set of control charts from 
his own industrial measurements will 
thereby become acquainted with the 
sampling risks incurred, assuming that 
he has noted the table of odds listed 
on page 106 of the book. 
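The probability attached to a three-sigma limit can be worked out directly if one is willing to assume that the plotted statistic is normally distributed and the process is in control. The sketch below makes that assumption; it does not reproduce the table of odds on page 106, which is not quoted in this letter, and the function name two_sided_tail is introduced here only for illustration.

```python
import math


def two_sided_tail(k):
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2.0))


for k in (1.0, 2.0, 3.0):
    p_outside = two_sided_tail(k)
    print(
        f"{k:.0f}-sigma limits: chance of a point outside = {p_outside:.4f} "
        f"(about 1 in {1.0 / p_outside:.0f})"
    )
```

Under these assumptions the chance of a point falling outside three-sigma limits by sampling fluctuation alone is a little under 3 in 1,000; the figure changes if the underlying distribution is not normal.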

Although the writer is more inter- 
ested in applying proven methods than 
in developing new ones, our descrip- 
tion and use of median-normal limits 
and fraction non-random are, to the 
best of our knowledge, new. We shall be 
interested in references showing where 
these have been presented elsewhere. 

The detailed and helpful nature of 
these reviews is greatly appreciated 
by the writer. 


JOHN W. DUDLEY, JR., Chemical
Engineer, American Viscose Corporation,
35 South Ninth St., Philadelphia 7, Pa.


MR. DUDLEY’S comments on my review
point out very well the difficulties
that inevitably arise in the
application of formalized techniques to 
research and industrial problems. Such 
techniques are only means to certain 
objectives of the experimenter or in- 
dustrial engineer, and may become 
very bad means, despite their preci- 
sion, in cases where the conditions of 
their application are not satisfied. The 
writer of a practical manual like Mr. 
Dudley’s little volume is faced with 
the two-fold problem of describing a 
precise technique for handling meas- 
urements, and at the same time ad- 
vising the reader on the restrictions of 
application. No one could hope to 
accomplish this problem completely 
within the scope of the volume, or 
even one many times as large. If a re- 
viewer adds a few precautions of his 
own, this is meant to supplement in 
some part the author’s own comments. 
Three-sigma limits are not always ad- 
visable, no fixed sample size is uni- 
versally applicable, etc. The reader 
should be made aware as often as 
possible that the choices he makes 
result in operating characteristics, 
which form the real basis for his de- 
cisions. And I personally would think 
this was one of the most “practical” 
problems in industrial statistics; its 
solution does not demand a knowl- 
edge of mathematical derivations, but 
rather a knowledge of producer and 
consumer risks. A knowledge of con- 
sumer risks, at least, does take the in- 
dustrial engineer out of his field, but so 
much the better. 
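The link between a chosen plan and its operating characteristics can be illustrated with a single-sampling plan for fraction defective. The sketch below is only an example under stated assumptions: the plan (sample of 50, accept on 2 or fewer defectives) and the two quality levels are hypothetical, and the binomial distribution is assumed, which treats the lot as large relative to the sample.

```python
from math import comb


def prob_accept(n, c, p):
    """Probability of accepting a lot of fraction defective p when a
    sample of n is drawn and the lot is accepted on c or fewer defectives."""
    return sum(comb(n, d) * p**d * (1.0 - p) ** (n - d) for d in range(c + 1))


n, c = 50, 2            # hypothetical sampling plan
good_quality = 0.01     # hypothetical quality the producer regards as acceptable
poor_quality = 0.10     # hypothetical quality the consumer wishes to reject

producer_risk = 1.0 - prob_accept(n, c, good_quality)  # good lot rejected
consumer_risk = prob_accept(n, c, poor_quality)        # poor lot accepted

print(f"producer risk: {producer_risk:.3f}")
print(f"consumer risk: {consumer_risk:.3f}")

# A few points on the operating characteristic curve.
for p in (0.01, 0.02, 0.05, 0.10, 0.15):
    print(f"p = {p:.2f}: P(accept) = {prob_accept(n, c, p):.3f}")
```

Changing n or c moves both risks at once, which is the sense in which the choice of limits or sample size is really a choice of operating characteristics.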


C. West CHURCHMAN, Associate 
Professor of Philosophy, Wayne 
University, Detroit, Michigan. 

1377. British Journal of Psychology:
Statistical Section. Issued by the British 
Psychological Society. At least two issues 
annually; 30s. per volume of about 160 
pages; vol. 1, part 1 started October 1947. 
Edited by CYRIL BURT and GODFREY 
THOMSON. Published by University of 
London Press, Ltd., Warwick Square, 
London E.C.4, England. 

1378. Calcutta Statistical Association. 
Calcutta Stat Assn B (1):1-4 Ag ’47.* 

1379. Calcutta Statistical Association 
Bulletin. Four issues per year; Rs. 5 or 10a. 
per year; no. 1 is dated August 1947. P. K. 
BOSE, Secretary, Calcutta Statistical Asso- 
ciation, c/o Department of Statistics, Asu- 
tosh Building, Calcutta University, Calcutta 
7, India. 

1380. ★ Conference Papers: First Annual
Convention, American Society for Quality 
Control and Second Midwest Quality Con- 
trol Conference, Hotel Sherman, Chicago, 
Illinois, Thursday-Friday, June 5 and 6, 
1947. Rochester 8, N. Y.: the Society (c/o 
Alfred L. Davis, Treas., Rochester Institute 
of Technology), 1947. Pp. vi, 284. Paper, 
lithotyped. $3.50. For contents, see 1431, 
1432, 1438, 1449, 1453, 1454, 1455, 1528, 
1562, 1650, 1665, 1691, 1712, 1714, 1718, 
1720, 1761, 1767, 1770, 1776, 1781, 1791, 
1847, 1848, 1853, 1861. 

1381. How Statistical Quality Control Is 
Being Applied at Timken-Detroit Plant. 
Automotive Ind 97(7):29+ O 1 ’47.*

1382. International Actuarial Notation. 
Trans Actuarial Soc Am 48(117):166-76 
My ’47.* 

1383. Mathematical Statistics and Econometrics
at the University of North Carolina.
Econometrica 16(1):125-6 Ja ’48.*

1384. New Statistical Method of Pre- 
dicting Sunspots. Cur Sci 16(10) :306 O ’47.* 

1385. [Personnel and Training Problems 
Created by the Recent Growth of Applied 
Statistics in the United States.] Sci 106 
(2756) :391-2 O 24 ’47.* A summary of 1119. 

1386. Quality Control at Cadillac Based 
on Latest Methods. Automotive Ind 97(3): 
27+ Ag 1 ’47.* 

1387. Statistical Methods Simplify Evaluation
of Data. Electronics 20(11):194+ N ’47.*

1388. The Social Use of Sample Surveys. 
Planning (250) :1-24 My 24 ’46.* 

1389. The Teaching of Statistics. En- 
gineering 164(4250) :38 Jl 11 ’47.* Summary 
of 1753. 

1390. The Teaching of Statistics: A Re- 
port of the Institute of Mathematical Sta- 
tistics Committee on the Teaching of Sta- 
tistics. HAROLD HOTELLING, Chair- 
man, WALTER BARTKY (Dean of the 
Division of the Physical Sciences and Professor
of Applied Mathematics, University
of Chicago, Chicago, Ill.), W. EDWARDS
DEMING (Adviser in Sampling, Bureau of 
the Budget, Washington, D. C.), MILTON 
FRIEDMAN, and PAUL HOEL. Ann 
Math Stat 19(1):95-115 Mr ’48.* 

1391. What Is a Sample? J Account
83(6) :525 Je ’47.* 

1392. ★ ABBOTT, J. C. (Associate Professor
of Mathematics), and BENAC,
T. J. (Associate Professor of Mathematics).
(United States Naval Academy, Annapolis,
Md.) Principles of Counting and Probability.
Annapolis, Md.: U. S. Naval Institute,
1947. Pp. iii, 40. Paper. $1.00.* To be reviewed.

1393. ABRUZZI, ADAM (Industrial 
Engineering Consultant, 107 West 107th 
St., Shanks Village, Orangeburg, N. Y.). 
Personnel Records and Statistical Charts. 
Personnel 24(1) :46-54 Jl ’47.* 

1394. ALBERT, G. E. (Associate Pro- 
fessor of Mathematics, University of Ten- 
nessee, Knoxville, Tenn.). A Note on the 
Fundamental Identity of Sequential Analy- 
sis. Ann Math Stat 18(4) :593-6 D ’47.* 

1395. ALCHIAN, ARMEN A. (Assistant
Professor of Economics, University of
California, Los Angeles, Calif.). Part III, 
Analysis Procedures, pp. 443-509. In 
Records, Analysis, and Test Procedures. 
Edited by Walter L. Deemer, Jr. Army Air 
Forces. Aviation Psychology Program Re- 
search Reports, No. 18. Washington, D. C.: 
Government Printing Office, 1947. Pp. ix, 
621. $2.25.* 

1396. ANDERSON, J. ANSEL (Chief 
Chemist, Grain Research Laboratory, 
Board of Grain Commissioners for Canada, 
Winnipeg, Manitoba, Canada). The Prep- 
aration of Illustrations and Tables. Trans 
Am Assn Cereal Chem 3(2) :74-104 Ja ’45.* 

1397. ANDERSON, J. ANSEL. The 
Role of Statistics in Technical Papers. 
Trans Am Assn Cereal Chem 3(2):69-73 
Ja ’45.*

1398. ANDERSON, R. L. (Associate 
Professor of Experimental Statistics and 
Agricultural Economics, North Carolina 
State College, Raleigh, N. C.). Use of 
Variance Components in the Analysis of 
Hog Prices in Two Markets. J Am Stat 
Assn 42(240) :612-34 D ’47.* 

1399. ANDERSON, R. L., and AN- 
DERSON, T. W. Distribution of the Circu- 
lar Serial Correlation Coefficient for 
Residuals From a Fitted Fourier Series: 
Preliminary Report. Abstract. Ann Math 
Stat 19(1):119 Mr ’48.* 

1400. ANDERSON, THEODORE W. 
(Assistant Professor of Mathematical Sta- 
tistics, Columbia University, New York, 
N. Y.). and HURWICZ, LEONID (Re- 
search Associate Professor, Statistical Lab- 
oratory, and Associate Professor of Eco- 
nomics, Iowa State College, Ames, Iowa). 
Errors and Shocks in Economic Relation- 
ships. Abstract. Econometrica 16(1):36-7 
Ja ’48.* 

1401. ANSCOMBE, F. J. (Lecturer in 
Mathematics, University of Cambridge, 
Cambridge, England); GODWIN, H. J. 
(Lecturer in Mathematics, University Col- 
lege of Swansea, Swansea, Wales); and 
PLACKETT, R. L. Methods of Deferred 
Sentencing in Testing the Fraction Defective
of a Continuous Output. Sup J Royal
Stat Soc 9(2):198-217 '47.* 

1402. ARMITAGE, P. (Statistician, 
Medical Research Council, Department of 
Medical Statistics, London School of Hy- 
giene and Tropical Medicine, University of 
London, London, England). Some Se- 
quential Tests of Student’s Hypothesis. 
Sup J Royal Stat Soc 9(2) :250-63 ’47.* 

1403. ARMITAGE, P. A Comparison of 
Stratified With Unrestricted Random Sam- 
pling From a Finite Population. Biometrika 
34(3-4) :273-80 D '47.* 

1404. AROIAN, LEO A. (Assistant Pro- 
fessor of Mathematics, Hunter College, 
New York 21, N. Y.). Note on the Cumu- 
lants of Fisher’s z-Distribution. Biometrika 
34(3-4):359-60 D ’47.*

1405. AROIAN, L. A., and DARKOW, 
MARGUERITE (Associate Professor of 
Mathematics, Hunter College, New York 
21, N. Y.). Fourth Degree Exponential 
Function. Abstract. Ann Math Stat 18(4): 
609 D ’47.* Also in B Am Math Soc 53(11): 
1128 N ’47.* 

1406. BAILEY, ARTHUR L. (Statis- 
tician, American Mutual Alliance, 60 East 
42nd St., New York 17, N. Y.). A Gen- 
eralized Theory of Credibility. Proc 
Casualty Actuarial Soc 32(62):13-20 N 16
’45.*

1407. BAILLIE, DONALD C. (Assist- 
ant Professor of Mathematics, University 
of Toronto, Toronto, Canada). Actuarial 
Note: On Testing the Significance of Mortality
Ratios by the Use of χ². Discussion by
THOMAS N. E. GREVILLE, pp. 541-3;
GEORGE C. CAMPBELL (Assistant Actuary,
Metropolitan Life Insurance Co.,
New York 10, N. Y.), pp. 543-6; CHALMERS
L. WEAVER, pp. 546-52; and
HAROLD CRAMER, p. 552. Trans Actu- 
arial Soc Am 47(116, 116S) :326-444, 541- 
61 O ’46.* 

1408. BAINBRIDGE, J. R. (Common- 
wealth Research Officer, Mechanical En- 
gineering Department, University of Mel- 
bourne, Melbourne, Australia). The Teach- 
ing of Statistics. Letter. Engineering 
164(4267):448 N 7 ’47.*

1409. BAKER, G. A. (Assistant Profes- 
sor of Mathematics and Assistant Statis- 
tician in the Experiment Station, Uni- 
versity of California, Davis, Calif.). The 
Effect of Selection Above Definite Lower 
Limits of Linear Functions of Normally 
Distributed Correlated Variables on the 
Means and Variances of Other Linear 
Functions. Abstract. Ann Math Stat 
19(1):118 Mr ’48.*

1410. BALDWIN, ALFRED L. (Associate
Professor of Psychology, and Research
Associate, Samuel S. Fels Research
Institute, Antioch College, Yellow Springs, 
Ohio). The Study of Individual Personality 
by Means of the Intraindividual Correla- 
tion. J Personality 14(3):151-68 Mr ’46.* 

1411. BARANKIN, E. W. (Instructor in 
Mathematics, University of California, 
Berkeley, Calif.). Independence of Parame- 
ters and Sufficient Statistics. Abstract. 
Ann Math Stat 19(1):118-9 Mr ’48.* 

1412. BARNETT, M. K. (Assistant Pro- 
fessor of Chemistry, University of Dayton, 
Dayton, Ohio). The Factorial Experiment 
in Engineering Research. American Insti- 
tute of Mining and Metallurgical Engineers, 
Technical Publication No. 2161. Metals 
Technol 14(4):1-12 Je ’47.* 

1413. BARTLETT, M. S. (Professor of 
Mathematical Statistics, University of 
Manchester, Manchester, England). The 
General Canonical Correlation Distribution. 
Ann Math Stat 18(1):1-17 Mr ’47.* 

1414. BARTLETT, M. S. Multivariate 
Analysis. Sup J Royal Stat Soc 9(2) :176-90, 
discussion 190-7 '47.* 

1415. BATES, EDMOND E. (Bates & 
Boswell, Management Engineers, 1709 
West Eighth St., Los Angeles 14, Calif.). 
How to Increase Tolerances and Obtain 
Closer Fits. Iron Age 160(1):58-61 Jl 3 ’47.*

1416. BAUMGART, ERNEST L. M. 
(Laboratoire de Physiologie des Sensations, 
Collège de France, Paris, France). The
Quantic and Statistical Bases of Visual Ex- 
citation. J General Physiol 31(3):269-90 
Ja 20 ’48.*

1417. BEARD, R. E. (Assistant Fire 
Manager, Pearl Assurance Co. Ltd., High 
Holborn, London W.C.1, England). The 
Standard Deviation of the Distribution of 
Sickness. J Inst Actuaries Students’ Soc 
7(1) :23-8 Jl ’47.* 

1418. BEARD, R. E. Statistical Prob- 
lems in Naval Aircraft Provisioning. J Inst 
Actuaries Students’ Soc 6(3):144-8 Ja '47.* 

1419. BEERS, HENRY S. (Vice-Presi- 
dent, Aetna Life Insurance Co., Hartford, 
Conn.). Modified-Interpolation Formulas 
That Minimize Fourth Differences. Discus- 
sion by WILMER A. JENKINS (Vice- 
President and Actuary, Teachers Insurance 
and Annuity Association, New York 18, 
N. Y.), pp. 184-5; and T. N. E. GREVILLE,
pp. 185-6. Rec Am Inst Actuaries
34(69, 70):14-20, 184-7 Je, N ’45.*

1420. BEERS, HENRY S. Premium 
Interpolation. Trans Actuarial Soc Am 
48(117):53-75 My ’47.*

1421. BELZ, MAURICE H. (Associate 
Professor of Mathematics, University of 
Melbourne, Melbourne, Australia). Note on 
the Liapounoff Inequality for Absolute
Moments. Ann Math Stat 18(4):604-5 D
’47.*

1422. BENNETT, W. R. (Research 
Physicist, Bell Telephone Laboratories, 
Murray Hill, N. J.). Distribution of the 
Sum of Randomly Phased Components.
Q Appl Math 5(4):385-93 Ja ’48.*

1423. BERKELEY, EDMUND C.
(Chief Research Consultant, General Office 
Administration Department, Prudential 
Insurance Co., Newark 1, N. J.). Electronic 
Machinery for Handling Information, and 
Its Uses in Insurance. Discussion by 
WILLIAM P. BARBER, JR. (Secretary, 
Connecticut Mutual Life Insurance Co., 
Hartford, Conn.), pp. 278-80; EDWARD 
H. WELLS (Assistant Actuary, Mutual 
Life Insurance Co., New York 5, N. Y.), 
pp. 280-1; and EDWARD A. RIEDER 
(Associate Actuary, Mutual Life Assurance 
Company of Canada, Waterloo, Ontario, 
Canada), pp. 282-4. Trans Actuarial Soc 
Am 48(117, 118):36-52, 278-88 My, O ’47.*

1424. BHAT, N. R. (Department of 
Genetics, University of Cambridge, Cam- 
bridge, England). An Improved Genetical 
Map of Punnett’s ‘B’ Chromosome in the 
Sweet Pea, Lathyrus Odoratus L. J Ge- 
netics 48(3) :343-58 Ja ’48.* 

1425. BHATTACHARYYA, A. (Statis- 
tical Laboratory, Presidency College, Cal- 
cutta, India). On Some Analogues of the 
Amount of Information and Their Use in 
18 O ’47.*

1426. BICKING, CHARLES A. (Qual- 
Wilmington 99, Del.). Quality Control for
the Pulp and Paper Industry. Paper Trade 
J 124(17) :638-7 Ap 24 ’47.* 

1427. BIRNBAUM, Z. W. (Associate 
Professor of Mathematics, University of 
Washington, Seattle, Wash.). On Random 
Ann Math Stat 19(1):76-81 Mr ’48.*

1428. BIZLEY, M. T. L. (Technical 
Assistant, Bacon and Woodrow, Consulting 
Waring’s B-Formula. J Inst Actuaries

1429. BLAXTER, K. L. (Biochemistry
Department, Ministry of Agriculture and 
Fisheries, Veterinary Laboratory, Wey- 
bridge, Surrey, England). The Evaluation 
of the Nutritive Value of Animal Feeding- 
Stuffs: The Application of Statistical 
Methods to Food Problems. Analyst 
73(862):11-5 Ja ’48.* Abstract Chem & 
Ind (3) :43-4 Ja 18 '47.* 

1430. BOCHNER, SOLOMON (Professor
of Mathematics, Princeton University,
Ann Math 48(4):1014-61 O ’47.*

1431. BOLANOVICH, D. J. (Personnel 
Research Analyst, Personnel Department, 
Conference Papers: First Annual Convention
(see 1380).

ray Hill, N. J.). A Simple Procedure for the
Making of Alignment Charts. J Appl 
Physics 19(1) :83-6 Ja ’48.* 

1434. BOSE, P. K. (Lecturer in Statis-
Function Populations Associated With the
khya 8(3):235-48 O ’47.*

1435. BOSE, R. C. (Calcutta University, 
Calcutta, India). On a Resolvable Series of 
Balanced Incomplete Block Designs. San- 
khya 8(3) :249-56 O ’47.* 

1436. BOWEN, G. M. (General Inspec- 
tor, Feeder Division, Westinghouse Electric 
Corp., East Pittsburgh, Pa.). Lot Sampling 
of Screw Machine Parts. Iron Age 161(5): 
74-6 Ja 29 ’48.* 

1437. BOWKER, ALBERT H. (Assist- 
ant Professor of Mathematical Statistics, 
Stanford University, Stanford, Calif.). 
Chap. 2, Tolerance Limits for Normal 
Distributions, pp. 95-110. In Selected Tech- 
niques of Statistical Analysis (see 1512). 

1438. BOYAN, EDWIN A. (Assistant 
Professor of Business Management, Massa- 
chusetts Institute of Technology, Cam- 